Commit graph

1660 commits

Author SHA1 Message Date
Thomas Gleixner
f0c7baca18 genirq/affinity: Make affinity setting if activated opt-in
John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:

 "It looks like what happens is that because the interrupts are not per-CPU
  in the hardware, armpmu_request_irq() calls irq_force_affinity() while
  the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
  IRQF_NOBALANCING.  

  Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
  irq_setup_affinity() which returns early because IRQF_PERCPU and
  IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."

This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.

Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...

Fixes: baedb87d1b ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
2020-07-27 16:20:40 +02:00
John Stultz
8d16f5b979 genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent
Add EXPORT_SYMBOL_GPL entries for irq_chip_retrigger_hierarchy()
and irq_chip_set_vcpu_affinity_parent() so that we can allow
drivers like the qcom-pdc driver to be loadable as a module.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-arm-msm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-gpio@vger.kernel.org
Link: https://lore.kernel.org/r/20200710231824.60699-3-john.stultz@linaro.org
2020-07-27 08:55:03 +01:00
John Stultz
8a667928cf irqdomain: Export irq_domain_update_bus_token
Add export for irq_domain_update_bus_token() so that
we can allow drivers like the qcom-pdc driver to be
loadable as a module.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Andy Gross <agross@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Maulik Shah <mkshah@codeaurora.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-arm-msm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-gpio@vger.kernel.org
Link: https://lore.kernel.org/r/20200710231824.60699-2-john.stultz@linaro.org
2020-07-27 08:55:02 +01:00
Zenghui Yu
45e9504f10 genirq/irqdomain: Remove redundant NULL pointer check on fwnode
The is_fwnode_irqchip() helper will check if the fwnode_handle is empty.
There is no need to perform a redundant check outside of it.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200716083905.287-1-yuzenghui@huawei.com
2020-07-27 08:55:02 +01:00
Thomas Gleixner
baedb87d1b genirq/affinity: Handle affinity setting on inactive interrupts correctly
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.

X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.

Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:

    - Store it in the irq descriptors affinity mask
    - Update the effective affinity to reflect that so user space has
      a consistent view
    - Don't call into the irq chip driver

This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.

Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.

For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.

Remove the X86 workaround as it is not longer required.

Fixes: 02edee152d ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
2020-07-17 23:30:43 +02:00
Valentin Schneider
8fa88a88d5 genirq: Remove preflow handler support
That was put in place for sparc64, and blackfin also used it for some time;
sparc64 no longer uses those, and blackfin is dead.

As there are no more users, remove preflow handlers.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200703155645.29703-3-valentin.schneider@arm.com
2020-07-04 10:02:06 +02:00
Peter Zijlstra
7a40798c71 sched,irq: Convert to sched_set_fifo()
Because SCHED_FIFO is a broken scheduler model (see previous patches)
take away the priority field, the kernel can't possibly make an
informed decision.

Effectively no change.

Cc: tglx@linutronix.de
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
2020-06-15 14:10:24 +02:00
Masahiro Yamada
a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Thomas Gleixner
76fe06c1e6 irqchip updates for Linux 5.8:
- A few new drivers for the Loongson MIPS platform (HTVEC, PIC, MSI)
 - A cleanup of the __irq_domain_add() API
 - A cleanup of the IRQ simulator to actually use some of
   the irq infrastructure
 - Some fixes for the Sifive PLIC when used in a multi-controller
   context
 - Fixes for the GICv3 ITS to spread interrupts according to the
   load of each CPU, and to honor managed interrupts
 - Numerous cleanups and documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl7Q/OQPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD/vQQAKnqdQ/RD1anm7mHNXzzbEnd9zqoke7er1EM
 kkZioYJ7KAFEc5SMHiyWwoAfxBvJ1hh/9d48N3dXVdNNaBezpS1uSyviUP92ofzL
 ZS+aEEyPA3CLAkYzJKmMUQ2LrLpdMc4QM7G+nHgquk0UBT1aNpduDoMRqToPpJ5C
 ZXfVjhZ8gaM//sueB2aRZZUGmu8lAqXiNBUn+encBFfCVp1iydBlumwD//viTD/g
 BMO956DAfzoVqb/0n/ZqdftbNR5gtb8wb5POH8I36HOfmqUIdF78OuRhzCm/Odqf
 uV/eRQ4tDgMVM5PuHNYACGax2DWRRlHVK1wCXcEgj9Wh1p2a7moPOq0zZuREQaEw
 4DyIi2VeK1MDkWv4NrPSuVupicwAioha9wVeSnfm9SkKAo1GEj5gHs49MUKas11s
 ikVLzYCenRfdviu54P0axI1x5HCTvPTXuosPR8Sn7kMQhxQuZxMrYo+9X8WudVOU
 oEPmK1khRQaU+sf2+wPWvwWt6LMcVzTwWWsXarhWDN1H+budTulpxKsfb0b0rxCp
 2viQJ+ncCks1usFrsmuD0bqxyJShDgqFMz46z0R6PYYO/4bJFU5t4UVjIsW1bJZV
 Vrnkkf9Aw6OiAV5eR7hsRDlYw8tRGm8CZiLRr3szVu4Py9RC4KdoprPf3xn78qqV
 TIujPzK6
 =RX31
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - A few new drivers for the Loongson MIPS platform (HTVEC, PIC, MSI)
 - A cleanup of the __irq_domain_add() API
 - A cleanup of the IRQ simulator to actually use some of
   the irq infrastructure
 - Some fixes for the Sifive PLIC when used in a multi-controller
   context
 - Fixes for the GICv3 ITS to spread interrupts according to the
   load of each CPU, and to honor managed interrupts
 - Numerous cleanups and documentation fixes
2020-05-30 09:40:12 +02:00
Marek Vasut
1d0326f352 genirq: Check irq_data_get_irq_chip() return value before use
irq_data_get_irq_chip() can return NULL, however it is expected that this
never happens. If a buggy driver leads to NULL being returned from
irq_data_get_irq_chip(), warn about it instead of crashing the machine.

Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

To: linux-arm-kernel@lists.infradead.org
2020-05-28 15:58:04 +02:00
Andy Shevchenko
9ed78b05f9 irqdomain: Allow software nodes for IRQ domain creation
In some cases we need to have an IRQ domain created out of software node.

One of such cases is DesignWare GPIO driver when it's instantiated from
half-baked ACPI table (alas, we can't fix it for devices which are few years
on market) and thus using software nodes to quirk this. But the driver
is using IRQ domains based on per GPIO port firmware nodes, which are in
the above case software ones. This brings a warning message to be printed

  [   73.957183] irq: Invalid fwnode type for irqdomain

and creates an anonymous IRQ domain without a debugfs entry.

Allowing software nodes to be valid for IRQ domains rids us of the warning
and debugs gets correctly populated.

  % ls -1 /sys/kernel/debug/irq/domains/
  ...
  intel-quark-dw-apb-gpio:portA

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[maz: refactored commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-3-andriy.shevchenko@linux.intel.com
2020-05-21 10:53:17 +01:00
Andy Shevchenko
87526603c8 irqdomain: Get rid of special treatment for ACPI in __irq_domain_add()
Now that __irq_domain_add() is able to better deals with generic
fwnodes, there is no need to special-case ACPI anymore.

Get rid of the special treatment for ACPI.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-2-andriy.shevchenko@linux.intel.com
2020-05-21 10:51:50 +01:00
Andy Shevchenko
181e9d4efa irqdomain: Make __irq_domain_add() less OF-dependent
__irq_domain_add() relies in some places on the fact that the fwnode
can be only of type OF. This prevents refactoring of the code to support
other types of fwnode. Make it less OF-dependent by switching it
to use the fwnode directly where it makes sense.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200520164927.39090-1-andriy.shevchenko@linux.intel.com
2020-05-21 10:50:30 +01:00
Bartosz Golaszewski
337cbeb2c1 genirq/irq_sim: Simplify the API
The interrupt simulator API exposes a lot of custom data structures and
functions and doesn't reuse the interfaces already exposed by the irq
subsystem. This patch tries to address it.

We hide all the simulator-related data structures from users and instead
rely on the well-known irq domain. When creating the interrupt simulator
the user receives a pointer to a newly created irq_domain and can use it
to create mappings for simulated interrupts.

It is also possible to pass a handle to fwnode when creating the simulator
domain and retrieve it using irq_find_matching_fwnode().

The irq_sim_fire() function is dropped as well. Instead we implement the
irq_get/set_irqchip_state interface.

We modify the two modules that use the simulator at the same time as
adding these changes in order to reduce the intermediate bloat that would
result when trying to migrate the drivers in separate patches.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> #for IIO
Link: https://lore.kernel.org/r/20200514083901.23445-3-brgl@bgdev.pl
2020-05-18 10:30:21 +01:00
Bartosz Golaszewski
5c8f77a278 irqdomain: Make irq_domain_reset_irq_data() available to non-hierarchical users
irq_domain_reset_irq_data() doesn't modify the parent data, so it can be
made available even if irq domain hierarchy is not being built. We'll
subsequently use it in irq_sim code.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20200514083901.23445-2-brgl@bgdev.pl
2020-05-18 10:29:26 +01:00
afzal mohammed
07d8350ede genirq: Remove setup_irq() and remove_irq()
Now that all the users of setup_irq() & remove_irq() have been replaced by
request_irq() & free_irq() respectively, delete them.

Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lkml.kernel.org/r/0aa8771ada1ac8e1312f6882980c9c08bd023148.1585320721.git.afzal.mohd.ma@gmail.com
2020-04-14 10:08:50 +02:00
Linus Torvalds
d5f744f9a2 x86 entry code updates:
- Convert the 32bit syscalls to be pt_regs based which removes the
       requirement to push all 6 potential arguments onto the stack and
       consolidates the interface with the 64bit variant
 
     - The first small portion of the exception and syscall related entry
       code consolidation which aims to address the recently discovered
       issues vs. RCU, int3, NMI and some other exceptions which can
       interrupt any context. The bulk of the changes is still work in
       progress and aimed for 5.8.
 
     - A few lockdep namespace cleanups which have been applied into this
       branch to keep the prerequisites for the ongoing work confined.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B/TMTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYA6EAC7r/bCMxBelljT3b7LkBbiJcocJ+zK
 OSzWU9miJGTAvYqn4/ciLKg4dA424b/1rBFlF1hBTCQ0HL5Cv4lajxdKEZCO5WCC
 WWTCz+MC60aWFaH3VNoywiLGb39H2IbqWbS9yNPd/wBkLHiMAD6NPQntOvcPaD4j
 1lyrMtLzfrWlrHxvxdI3kt5ZpFLYNXr2xk61xQjTz0ROFQBhf2sDsuhHhiYVLPj7
 JwYktpbBiPeaw2+I18NPymNPY+VfY8LCTgLl5M+rbKyCqebKaedZQJ7QXFhAEqKC
 Y2f+gJsKWtTDzGP2mk/5kF0uP7cd0vJK35ZCXtLZ9BbcNtFZU6w+ADqRo4pJBHRY
 QRzo/AWrdkuTJF0CrP6mcneNC7NwWLSdKrE1z77RQCHUPVvhHhRDZsgdLcZ/KKwx
 y1ji22trwNB+7LmI2fUOU5RRHZBIuNvQT+mPt24febJuHpZKul62dd3cqTGeSTC+
 MYVknYDSg/+jk+83DhuZnTyb9lWTbq/0Q1HRDu6l2LrMIH7YMPpY5Ea64ZFYzWXy
 s0+iHEM4mUzltwNauHIntjbwXi3C0l2k1WQyG0gun2eS6SXfu0lb93V4msFj/N1+
 oHavH2n2A4XrRr+Ob87fsl7nfXJibWP7R9xPblrWP2sNdqfjSyGd49rnsvpWqWMK
 Fj0d7tQ78+/SwA==
 =tWXS
 -----END PGP SIGNATURE-----

Merge tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 entry code updates from Thomas Gleixner:

 - Convert the 32bit syscalls to be pt_regs based which removes the
   requirement to push all 6 potential arguments onto the stack and
   consolidates the interface with the 64bit variant

 - The first small portion of the exception and syscall related entry
   code consolidation which aims to address the recently discovered
   issues vs. RCU, int3, NMI and some other exceptions which can
   interrupt any context. The bulk of the changes is still work in
   progress and aimed for 5.8.

 - A few lockdep namespace cleanups which have been applied into this
   branch to keep the prerequisites for the ongoing work confined.

* tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS
  lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
  lockdep: Rename trace_softirqs_{on,off}()
  lockdep: Rename trace_hardirq_{enter,exit}()
  x86/entry: Rename ___preempt_schedule
  x86: Remove unneeded includes
  x86/entry: Drop asmlinkage from syscalls
  x86/entry/32: Enable pt_regs based syscalls
  x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments
  x86/entry/32: Rename 32-bit specific syscalls
  x86/entry/32: Clean up syscall_32.tbl
  x86/entry: Remove ABI prefixes from functions in syscall tables
  x86/entry/64: Add __SYSCALL_COMMON()
  x86/entry: Remove syscall qualifier support
  x86/entry/64: Remove ptregs qualifier from syscall table
  x86/entry: Move max syscall number calculation to syscallhdr.sh
  x86/entry/64: Split X32 syscall table into its own file
  x86/entry/64: Move sys_ni_syscall stub to common.c
  x86/entry/64: Use syscall wrappers for x32_rt_sigreturn
  x86/entry: Refactor SYS_NI macros
  ...
2020-03-30 19:14:28 -07:00
Linus Torvalds
2d385336af Updates for the interrupt subsystem:
Treewide:
 
     - Cleanup of setup_irq() which is not longer required because the
       memory allocator is available early. Most cleanup changes come
       through the various maintainer trees, so the final removal of
       setup_irq() is postponed towards the end of the merge window.
 
   Core:
 
     - Protection against unsafe invocation of interrupt handlers and unsafe
       interrupt injection including a fixup of the offending PCI/AER error
       injection mechanism.
 
       Invoking interrupt handlers from arbitrary contexts, i.e. outside of
       an actual interrupt, can cause inconsistent state on the fragile
       x86 interrupt affinity changing hardware trainwreck.
 
   Drivers:
 
     - Second wave of support for the new ARM GICv4.1
     - Multi-instance support for Xilinx and PLIC interrupt controllers
     - CPU-Hotplug support for PLIC
     - The obligatory new driver for X1000 TCU
     - Enhancements, cleanups and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B888THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeMJD/9v8GcI/DSY87Fmo7s4odLFVU0J8zZ6
 7QlYjSPm4yWv4pqn1TEnEF2pKz5X9Euhoh8BmdMKtdXBqlS4Ix9N+pH8ModcxyQo
 aX97zuRUxvqfeeVE+yQRwbbMREj9jj9RW8FRtA39+l5H3uC1GDcc+2aAMIaykQ7+
 8lo/6wBd8ZrZ0gsNf4KjlBwMDYAlQSRWxrff38PQ2XRpGKowdp8JFYZuq5Vp0ljJ
 r2cE75ldmFSfmtuhhVroBRY0GAqW4/8v8/syAN3Q9jOEII60qhA0dqR085B9veWa
 DHSqgLmzyUFFXN7Ntzt/fDirJVsIM4BE9qGu3ftCYHMaPB8hG+xqjbZe9E3D2e/d
 +0Pb3TG8EHVOIwzv1t9+6462qYGkBhmBXtbj6GptPYk2Ai4HZlNaSsa8jUNyHvGz
 WDegdRjt7O5RjqDH/VwrQxW/AEp05f/1egweBXbq9aF6j9nqeOur75c/PdxZxAX5
 WUMtouXP2WN+sMW8k1T5cmVMGWxLGBB0wwG4LC/mXzHnkDiN1+2wEUHmhS8Voi3q
 3HXeYBJeukUYbVvMKRvWVAD330TxFjAyd6pPwCdoNY2ZngJnQWlDD9vbYYX2osoW
 kP+KhIANNBVqdK7NqlLoqcr3SdHn01pQYuVHejNzxb7E6/mmpMlaYDJc/rMPi/eM
 0/rzl8fAj/WyBQ==
 =DZ/G
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt subsystem:

  Treewide:

    - Cleanup of setup_irq() which is not longer required because the
      memory allocator is available early.

      Most cleanup changes come through the various maintainer trees, so
      the final removal of setup_irq() is postponed towards the end of
      the merge window.

  Core:

    - Protection against unsafe invocation of interrupt handlers and
      unsafe interrupt injection including a fixup of the offending
      PCI/AER error injection mechanism.

      Invoking interrupt handlers from arbitrary contexts, i.e. outside
      of an actual interrupt, can cause inconsistent state on the
      fragile x86 interrupt affinity changing hardware trainwreck.

  Drivers:

    - Second wave of support for the new ARM GICv4.1

    - Multi-instance support for Xilinx and PLIC interrupt controllers

    - CPU-Hotplug support for PLIC

    - The obligatory new driver for X1000 TCU

    - Enhancements, cleanups and fixes all over the place"

* tag 'irq-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
  unicore32: Replace setup_irq() by request_irq()
  sh: Replace setup_irq() by request_irq()
  hexagon: Replace setup_irq() by request_irq()
  c6x: Replace setup_irq() by request_irq()
  alpha: Replace setup_irq() by request_irq()
  irqchip/gic-v4.1: Eagerly vmap vPEs
  irqchip/gic-v4.1: Add VSGI property setup
  irqchip/gic-v4.1: Add VSGI allocation/teardown
  irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer
  irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacks
  irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacks
  irqchip/gic-v4.1: Plumb mask/unmask SGI callbacks
  irqchip/gic-v4.1: Add initial SGI configuration
  irqchip/gic-v4.1: Plumb skeletal VSGI irqchip
  irqchip/stm32: Retrigger both in eoi and unmask callbacks
  irqchip/gic-v3: Move irq_domain_update_bus_token to after checking for NULL domain
  irqchip/xilinx: Do not call irq_set_default_host()
  irqchip/xilinx: Enable generic irq multi handler
  irqchip/xilinx: Fill error code when irq domain registration fails
  irqchip/xilinx: Add support for multiple instances
  ...
2020-03-30 17:35:14 -07:00
Linus Torvalds
4b9fd8a829 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Continued user-access cleanups in the futex code.

   - percpu-rwsem rewrite that uses its own waitqueue and atomic_t
     instead of an embedded rwsem. This addresses a couple of
     weaknesses, but the primary motivation was complications on the -rt
     kernel.

   - Introduce raw lock nesting detection on lockdep
     (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal
     lock differences. This too originates from -rt.

   - Reuse lockdep zapped chain_hlocks entries, to conserve RAM
     footprint on distro-ish kernels running into the "BUG:
     MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep
     chain-entries pool.

   - Misc cleanups, smaller fixes and enhancements - see the changelog
     for details"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
  fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t
  thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t
  Documentation/locking/locktypes: Minor copy editor fixes
  Documentation/locking/locktypes: Further clarifications and wordsmithing
  m68knommu: Remove mm.h include from uaccess_no.h
  x86: get rid of user_atomic_cmpxchg_inatomic()
  generic arch_futex_atomic_op_inuser() doesn't need access_ok()
  x86: don't reload after cmpxchg in unsafe_atomic_op2() loop
  x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end()
  objtool: whitelist __sanitizer_cov_trace_switch()
  [parisc, s390, sparc64] no need for access_ok() in futex handling
  sh: no need of access_ok() in arch_futex_atomic_op_inuser()
  futex: arch_futex_atomic_op_inuser() calling conventions change
  completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all()
  lockdep: Add posixtimer context tracing bits
  lockdep: Annotate irq_work
  lockdep: Add hrtimer context tracing bits
  lockdep: Introduce wait-type checks
  completion: Use simple wait queues
  sched/swait: Prepare usage in completions
  ...
2020-03-30 16:17:15 -07:00
Edward Cree
df81dfcfd6 genirq: Fix reference leaks on irq affinity notifiers
The handling of notify->work did not properly maintain notify->kref in two
 cases:
1) where the work was already scheduled, another irq_set_affinity_locked()
   would get the ref and (no-op-ly) schedule the work.  Thus when
   irq_affinity_notify() ran, it would drop the original ref but not the
   additional one.
2) when cancelling the (old) work in irq_set_affinity_notifier(), if there
   was outstanding work a ref had been got for it but was never put.
Fix both by checking the return values of the work handling functions
 (schedule_work() for (1) and cancel_work_sync() for (2)) and put the
 extra ref if the return value indicates preexisting work.

Fixes: cd7eab44e9 ("genirq: Add IRQ affinity notifiers")
Fixes: 59c39840f5 ("genirq: Prevent use-after-free and work list corruption")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ben Hutchings <ben@decadent.org.uk>
Link: https://lkml.kernel.org/r/24f5983f-2ab5-e83a-44ee-a45b5f9300f5@solarflare.com
2020-03-21 17:32:46 +01:00
Peter Zijlstra
de8f5e4f2d lockdep: Introduce wait-type checks
Extend lockdep to validate lock wait-type context.

The current wait-types are:

	LD_WAIT_FREE,		/* wait free, rcu etc.. */
	LD_WAIT_SPIN,		/* spin loops, raw_spinlock_t etc.. */
	LD_WAIT_CONFIG,		/* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
	LD_WAIT_SLEEP,		/* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().

Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.

Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.

[ To make static initialization easier we have the rule that:
  .outer == INV means .outer == .inner; because INV == 0. ]

It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.

Below is an example output generated by the trivial test code:

  raw_spin_lock(&foo);
  spin_lock(&bar);
  spin_unlock(&bar);
  raw_spin_unlock(&foo);

 [ BUG: Invalid wait context ]
 -----------------------------
 swapper/0/1 is trying to lock:
 ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
 other info that might help us debug this:
 1 lock held by swapper/0/1:
  #0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187

The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.

This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.

Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.

The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.

The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.

[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
	   failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
2020-03-21 16:00:24 +01:00
Alexander Sverdlin
87f2d1c662 genirq/irqdomain: Check pointer in irq_domain_alloc_irqs_hierarchy()
irq_domain_alloc_irqs_hierarchy() has 3 call sites in the compilation unit
but only one of them checks for the pointer which is being dereferenced
inside the called function. Move the check into the function. This allows
for catching the error instead of the following crash:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
PC is at 0x0
LR is at gpiochip_hierarchy_irq_domain_alloc+0x11f/0x140
...
[<c06c23ff>] (gpiochip_hierarchy_irq_domain_alloc)
[<c0462a89>] (__irq_domain_alloc_irqs)
[<c0462dad>] (irq_create_fwspec_mapping)
[<c06c2251>] (gpiochip_to_irq)
[<c06c1c9b>] (gpiod_to_irq)
[<bf973073>] (gpio_irqs_init [gpio_irqs])
[<bf974048>] (gpio_irqs_exit+0xecc/0xe84 [gpio_irqs])
Code: bad PC value

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200306174720.82604-1-alexander.sverdlin@nokia.com
2020-03-08 13:03:41 +01:00
Thomas Gleixner
acd26bcf36 genirq: Provide interrupt injection mechanism
Error injection mechanisms need a half ways safe way to inject interrupts as
invoking generic_handle_irq() or the actual device interrupt handler
directly from e.g. a debugfs write is not guaranteed to be safe.

On x86 generic_handle_irq() is unsafe due to the hardware trainwreck which
is the base of x86 interrupt delivery and affinity management.

Move the irq debugfs injection code into a separate function which can be
used by error injection code as well.

The implementation prevents at least that state is corrupted, but it cannot
close a very tiny race window on x86 which might result in a stale and not
serviced device interrupt under very unlikely circumstances.

This is explicitly for debugging and testing and not for production use or
abuse in random driver code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.990928309@linutronix.de
2020-03-08 11:06:42 +01:00
Thomas Gleixner
da90921acc genirq: Sanitize state handling in check_irq_resend()
The code sets IRQS_REPLAY unconditionally whether the resend happens or
not. That doesn't have bad side effects right now, but inconsistent state
is always a latent source of problems.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.882129117@linutronix.de
2020-03-08 11:06:41 +01:00
Thomas Gleixner
1f85b1f5e1 genirq: Add return value to check_irq_resend()
In preparation for an interrupt injection interface which can be used
safely by error injection mechanisms. e.g. PCI-E/ AER, add a return value
to check_irq_resend() so errors can be propagated to the caller.

Split out the software resend code so the ugly #ifdef in check_irq_resend()
goes away and the whole thing becomes readable.

Fix up the caller in debugfs. The caller in irq_startup() does not care
about the return value as this is unconditionally invoked for all
interrupts and the resend is best effort anyway.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.775200917@linutronix.de
2020-03-08 11:06:41 +01:00
Thomas Gleixner
c16816acd0 genirq: Add protection against unsafe usage of generic_handle_irq()
In general calling generic_handle_irq() with interrupts disabled from non
interrupt context is harmless. For some interrupt controllers like the x86
trainwrecks this is outright dangerous as it might corrupt state if an
interrupt affinity change is pending.

Add infrastructure which allows to mark interrupts as unsafe and catch such
usage in generic_handle_irq().

Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.590923677@linutronix.de
2020-03-08 11:06:40 +01:00
Thomas Gleixner
a740a423c3 genirq/debugfs: Add missing sanity checks to interrupt injection
Interrupts cannot be injected when the interrupt is not activated and when
a replay is already in progress.

Fixes: 536e2e34bd ("genirq/debugfs: Triggering of interrupts from userspace")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200306130623.500019114@linutronix.de
2020-03-08 11:06:40 +01:00
luanshi
b513df6780 irqdomain: Fix function documentation of __irq_domain_alloc_fwnode()
The function got renamed at some point, but the kernel-doc was not updated.

Signed-off-by: luanshi <zhangliguang@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1583200125-58806-1-git-send-email-zhangliguang@linux.alibaba.com
2020-03-08 11:02:24 +01:00
Thomas Gleixner
cba6437a18 genirq/proc: Reject invalid affinity masks (again)
Qian Cai reported that the WARN_ON() in the x86/msi affinity setting code,
which catches cases where the affinity setting is not done on the CPU which
is the current target of the interrupt, triggers during CPU hotplug stress
testing.

It turns out that the warning which was added with the commit addressing
the MSI affinity race unearthed yet another long standing bug.

If user space writes a bogus affinity mask, i.e. it contains no online CPUs,
then it calls irq_select_affinity_usr(). This was introduced for ALPHA in

  eee45269b0 ("[PATCH] Alpha: convert to generic irq framework (generic part)")

and subsequently made available for all architectures in

  1840475676 ("genirq: Expose default irq affinity mask (take 3)")

which introduced the circumvention of the affinity setting restrictions for
interrupt which cannot be moved in process context.

The whole exercise is bogus in various aspects:

  1) If the interrupt is already started up then there is absolutely
     no point to honour a bogus interrupt affinity setting from user
     space. The interrupt is already assigned to an online CPU and it
     does not make any sense to reassign it to some other randomly
     chosen online CPU.

  2) If the interupt is not yet started up then there is no point
     either. A subsequent startup of the interrupt will invoke
     irq_setup_affinity() anyway which will chose a valid target CPU.

So the only correct solution is to just return -EINVAL in case user space
wrote an affinity mask which does not contain any online CPUs, except for
ALPHA which has it's own magic sauce for this.

Fixes: 1840475676 ("genirq: Expose default irq affinity mask (take 3)")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Qian Cai <cai@lca.pw>
Link: https://lkml.kernel.org/r/878sl8xdbm.fsf@nanos.tec.linutronix.de
2020-02-14 09:43:17 +01:00
Linus Torvalds
1a2a76c268 A set of fixes for X86:
- Ensure that the PIT is set up when the local APIC is disable or
    configured in legacy mode. This is caused by an ordering issue
    introduced in the recent changes which skip PIT initialization when the
    TSC and APIC frequencies are already known.
 
  - Handle malformed SRAT tables during early ACPI parsing which caused an
    infinite loop anda boot hang.
 
  - Fix a long standing race in the affinity setting code which affects PCI
    devices with non-maskable MSI interrupts. The problem is caused by the
    non-atomic writes of the MSI address (destination APIC id) and data
    (vector) fields which the device uses to construct the MSI message. The
    non-atomic writes are mandated by PCI.
 
    If both fields change and the device raises an interrupt after writing
    address and before writing data, then the MSI block constructs a
    inconsistent message which causes interrupts to be lost and subsequent
    malfunction of the device.
 
    The fix is to redirect the interrupt to the new vector on the current
    CPU first and then switch it over to the new target CPU. This allows to
    observe an eventually raised interrupt in the transitional stage (old
    CPU, new vector) to be observed in the APIC IRR and retriggered on the
    new target CPU and the new vector. The potential spurious interrupts
    caused by this are harmless and can in the worst case expose a buggy
    driver (all handlers have to be able to deal with spurious interrupts as
    they can and do happen for various reasons).
 
  - Add the missing suspend/resume mechanism for the HYPERV hypercall page
    which prevents resume hibernation on HYPERV guests. This change got
    lost before the merge window.
 
  - Mask the IOAPIC before disabling the local APIC to prevent potentially
    stale IOAPIC remote IRR bits which cause stale interrupt lines after
    resume.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl5AEJwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoWY2D/47ur9gsVQGryKzneVAr0SCsq4Un11e
 uifX4ldu4gCEBRTYhpgcpiFKeLvY/QJ6uOD+gQUHyy/s+lCf6yzE6UhXEqSCtcT7
 LkSxD8jAFf6KhMA6iqYBfyxUsPMXBetLjjHWsyc/kf15O/vbYm7qf05timmNZkDS
 S7C+yr3KRqRjLR7G7t4twlgC9aLcNUQihUdsH2qyTvjnlkYHJLDa0/Js7bFYYKVx
 9GdUDLvPFB1mZ76g012De4R3kJsWitiyLlQ38DP5VysKulnszUCdiXlgCEFrgxvQ
 OQhLafQzOAzvxQmP+1alODR0dmJZA8k0zsDeeTB/vTpRvv6+Pe2qUswLSpauBzuq
 TpDsrv8/5pwZh28+91f/Unk+tH8NaVNtGe/Uf+ePxIkn1nbqL84o4NHGplM6R97d
 HAWdZQZ1cGRLf6YRRJ+57oM/5xE3vBbF1Wn0+QDTFwdsk2vcxuQ4eB3M/8E1V7Zk
 upp8ty50bZ5+rxQ8XTq/eb8epSRnfLoBYpi4ux6MIOWRdmKDl40cDeZCzA2kNP7m
 qY1haaRN3ksqvhzc0Yf6cL+CgvC4ur8gRHezfOqmBzVoaLyVEFIVjgjR/ojf0bq8
 /v+L9D5+IdIv4jEZruRRs0gOXNDzoBbvf0qKGaO0tUTWiDsv7c5AGixp8aozniHS
 HXsv1lIpRuC7WQ==
 =WxKD
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Ensure that the PIT is set up when the local APIC is disable or
     configured in legacy mode. This is caused by an ordering issue
     introduced in the recent changes which skip PIT initialization when
     the TSC and APIC frequencies are already known.

   - Handle malformed SRAT tables during early ACPI parsing which caused
     an infinite loop anda boot hang.

   - Fix a long standing race in the affinity setting code which affects
     PCI devices with non-maskable MSI interrupts. The problem is caused
     by the non-atomic writes of the MSI address (destination APIC id)
     and data (vector) fields which the device uses to construct the MSI
     message. The non-atomic writes are mandated by PCI.

     If both fields change and the device raises an interrupt after
     writing address and before writing data, then the MSI block
     constructs a inconsistent message which causes interrupts to be
     lost and subsequent malfunction of the device.

     The fix is to redirect the interrupt to the new vector on the
     current CPU first and then switch it over to the new target CPU.
     This allows to observe an eventually raised interrupt in the
     transitional stage (old CPU, new vector) to be observed in the APIC
     IRR and retriggered on the new target CPU and the new vector.

     The potential spurious interrupts caused by this are harmless and
     can in the worst case expose a buggy driver (all handlers have to
     be able to deal with spurious interrupts as they can and do happen
     for various reasons).

   - Add the missing suspend/resume mechanism for the HYPERV hypercall
     page which prevents resume hibernation on HYPERV guests. This
     change got lost before the merge window.

   - Mask the IOAPIC before disabling the local APIC to prevent
     potentially stale IOAPIC remote IRR bits which cause stale
     interrupt lines after resume"

* tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Mask IOAPIC entries when disabling the local APIC
  x86/hyperv: Suspend/resume the hypercall page for hibernation
  x86/apic/msi: Plug non-maskable MSI affinity race
  x86/boot: Handle malformed SRAT tables during early ACPI parsing
  x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
2020-02-09 12:11:12 -08:00
Stephen Boyd
f9f21cea31 genirq: Clarify that irq wake state is orthogonal to enable/disable
There's some confusion around if an irq that's disabled with disable_irq()
can still wake the system from sleep states such as "suspend to RAM".

Clarify this in the kernel documentation for irq_set_irq_wake() so that
it's clear that an irq can be disabled and still wake the system if it has
been marked for wakeup.

Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lkml.kernel.org/r/20200206191521.94559-1-swboyd@chromium.org
2020-02-07 21:37:08 +01:00
Alexey Dobriyan
97a32539b9 proc: convert everything to "struct proc_ops"
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

	llseek		=> proc_lseek
	unlocked_ioctl	=> proc_ioctl

	xxx		=> proc_xxx

	delete ".owner = THIS_MODULE" line

[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
  Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:26 +00:00
Thomas Gleixner
6f1a4891a5 x86/apic/msi: Plug non-maskable MSI affinity race
Evan tracked down a subtle race between the update of the MSI message and
the device raising an interrupt internally on PCI devices which do not
support MSI masking. The update of the MSI message is non-atomic and
consists of either 2 or 3 sequential 32bit wide writes to the PCI config
space.

   - Write address low 32bits
   - Write address high 32bits (If supported by device)
   - Write data

When an interrupt is migrated then both address and data might change, so
the kernel attempts to mask the MSI interrupt first. But for MSI masking is
optional, so there exist devices which do not provide it. That means that
if the device raises an interrupt internally between the writes then a MSI
message is sent built from half updated state.

On x86 this can lead to spurious interrupts on the wrong interrupt
vector when the affinity setting changes both address and data. As a
consequence the device interrupt can be lost causing the device to
become stuck or malfunctioning.

Evan tried to handle that by disabling MSI accross an MSI message
update. That's not feasible because disabling MSI has issues on its own:

 If MSI is disabled the PCI device is routing an interrupt to the legacy
 INTx mechanism. The INTx delivery can be disabled, but the disablement is
 not working on all devices.

 Some devices lose interrupts when both MSI and INTx delivery are disabled.

Another way to solve this would be to enforce the allocation of the same
vector on all CPUs in the system for this kind of screwed devices. That
could be done, but it would bring back the vector space exhaustion problems
which got solved a few years ago.

Fortunately the high address (if supported by the device) is only relevant
when X2APIC is enabled which implies interrupt remapping. In the interrupt
remapping case the affinity setting is happening at the interrupt remapping
unit and the PCI MSI message is programmed only once when the PCI device is
initialized.

That makes it possible to solve it with a two step update:

  1) Target the MSI msg to the new vector on the current target CPU

  2) Target the MSI msg to the new vector on the new target CPU

In both cases writing the MSI message is only changing a single 32bit word
which prevents the issue of inconsistency.

After writing the final destination it is necessary to check whether the
device issued an interrupt while the intermediate state #1 (new vector,
current CPU) was in effect.

This is possible because the affinity change is always happening on the
current target CPU. The code runs with interrupts disabled, so the
interrupt can be detected by checking the IRR of the local APIC. If the
vector is pending in the IRR then the interrupt is retriggered on the new
target CPU by sending an IPI for the associated vector on the target CPU.

This can cause spurious interrupts on both the local and the new target
CPU.

 1) If the new vector is not in use on the local CPU and the device
    affected by the affinity change raised an interrupt during the
    transitional state (step #1 above) then interrupt entry code will
    ignore that spurious interrupt. The vector is marked so that the
    'No irq handler for vector' warning is supressed once.

 2) If the new vector is in use already on the local CPU then the IRR check
    might see an pending interrupt from the device which is using this
    vector. The IPI to the new target CPU will then invoke the handler of
    the device, which got the affinity change, even if that device did not
    issue an interrupt

 3) If the new vector is in use already on the local CPU and the device
    affected by the affinity change raised an interrupt during the
    transitional state (step #1 above) then the handler of the device which
    uses that vector on the local CPU will be invoked.

expose issues in device driver interrupt handlers which are not prepared to
handle a spurious interrupt correctly. This not a regression, it's just
exposing something which was already broken as spurious interrupts can
happen for a lot of reasons and all driver handlers need to be able to deal
with them.

Reported-by: Evan Green <evgreen@chromium.org>
Debugged-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Evan Green <evgreen@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de
2020-02-01 09:31:47 +01:00
Thomas Gleixner
43ee74487b irqchip updates for Linux 5.6:
- Conversion of the SiFive PLIC to hierarchical domains
 - New SiFive GPIO irqchip driver
 - New Aspeed SCI irqchip driver
 - New NXP INTMUX irqchip driver
 - Additional support for the Meson A1 GPIO irqchip
 - First part of the GICv4.1 support
 - Assorted fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl4rKn0PHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDHVoQALTTYQol+5Gz5pLxnROYEAdFjzrVrCarsK/b
 Cl4uVa5efOTCItSO3L9cEo1zoB++aJxPSOaKqX9hryPwPLTZzDiHYtVQ870tZB+k
 233cTvtT8+iw7/JPKnA8706TYDk1FUkJQ87V0gMLrnVH00dmJ8LvjW1bCdXV8iIa
 Ln78XIF+Ass+qJjSpCDRaOukDm6Qs+sZKAY0+nLXM8Ge564fdX7bPkDGN4tq9DLz
 74ZxY6s3rI5FoPceS270dtDf4Ib8gH+T8Bqd5AYSj/tcRE23s4muGb/O3Kez5Oko
 eEiuSadpep/kPQhgZlpX0tJgtEqHNfi6K8AIMscQQDFmJyuCqgR9/5as+UKX1V0M
 kPlOQtYCAVZmTnlOP6rA2V3RUFurVkFPkwUGzVYlCYxxrARvsH+vPxYqAPH/EEFq
 lGUo+2Z7Z+1ubPsnR8WKs8heC6qJidegGUtKoKYWroJl+tiuT6EtCP3J0QZPhdXT
 lVOBVnR6DHNIURuAEmag/eNYsBIj7PdmlByoMkBFn9LPE7Fn+OExJgbyVsu1IaTe
 AcUHmXR9QpcAKnDLmNSqFvhWsLo8CJ607rH3tL8vqnfijOHyt4AvKeE1R4QSavPx
 0F3FFNdo7Y1FAlJ9Ibw0gLvoIa6uP6FpdI3rht0iRaOZJlnDTbn+B8UayY0Ajvyp
 aGIjx7tY
 =8iz1
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

- Conversion of the SiFive PLIC to hierarchical domains
- New SiFive GPIO irqchip driver
- New Aspeed SCI irqchip driver
- New NXP INTMUX irqchip driver
- Additional support for the Meson A1 GPIO irqchip
- First part of the GICv4.1 support
- Assorted fixes
2020-01-24 20:08:51 +01:00
Ming Lei
11ea68f553 genirq, sched/isolation: Isolate from handling managed interrupts
The affinity of managed interrupts is completely handled in the kernel and
cannot be changed via the /proc/irq/* interfaces from user space. As the
kernel tries to spread out interrupts evenly accross CPUs on x86 to prevent
vector exhaustion, it can happen that a managed interrupt whose affinity
mask contains both isolated and housekeeping CPUs is routed to an isolated
CPU. As a consequence IO submitted on a housekeeping CPU causes interrupts
on the isolated CPU.

Add a new sub-parameter 'managed_irq' for 'isolcpus' and the corresponding
logic in the interrupt affinity selection code.

The subparameter indicates to the interrupt affinity selection logic that
it should try to avoid the above scenario.

This isolation is best effort and only effective if the automatically
assigned interrupt mask of a device queue contains isolated and
housekeeping CPUs. If housekeeping CPUs are online then such interrupts are
directed to the housekeeping CPU so that IO submitted on the housekeeping
CPU cannot disturb the isolated CPU.

If a queue's affinity mask contains only isolated CPUs then this parameter
has no effect on the interrupt routing decision, though interrupts are only
happening when tasks running on those isolated CPUs submit IO. IO submitted
on housekeeping CPUs has no influence on those queues.

If the affinity mask contains both housekeeping and isolated CPUs, but none
of the contained housekeeping CPUs is online, then the interrupt is also
routed to an isolated CPU. Interrupts are only delivered when one of the
isolated CPUs in the affinity mask submits IO. If one of the contained
housekeeping CPUs comes online, the CPU hotplug logic migrates the
interrupt automatically back to the upcoming housekeeping CPU. Depending on
the type of interrupt controller, this can require that at least one
interrupt is delivered to the isolated CPU in order to complete the
migration.

[ tglx: Removed unused parameter, added and edited comments/documentation
  	and rephrased the changelog so it contains more details. ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200120091625.17912-1-ming.lei@redhat.com
2020-01-22 16:29:49 +01:00
Kevin Hao
0f394daef8 irqdomain: Fix a memory leak in irq_domain_push_irq()
Fix a memory leak reported by kmemleak:
unreferenced object 0xffff000bc6f50e80 (size 128):
  comm "kworker/23:2", pid 201, jiffies 4294894947 (age 942.132s)
  hex dump (first 32 bytes):
    00 00 00 00 41 00 00 00 86 c0 03 00 00 00 00 00  ....A...........
    00 a0 b2 c6 0b 00 ff ff 40 51 fd 10 00 80 ff ff  ........@Q......
  backtrace:
    [<00000000e62d2240>] kmem_cache_alloc_trace+0x1a4/0x320
    [<00000000279143c9>] irq_domain_push_irq+0x7c/0x188
    [<00000000d9f4c154>] thunderx_gpio_probe+0x3ac/0x438
    [<00000000fd09ec22>] pci_device_probe+0xe4/0x198
    [<00000000d43eca75>] really_probe+0xdc/0x320
    [<00000000d3ebab09>] driver_probe_device+0x5c/0xf0
    [<000000005b3ecaa0>] __device_attach_driver+0x88/0xc0
    [<000000004e5915f5>] bus_for_each_drv+0x7c/0xc8
    [<0000000079d4db41>] __device_attach+0xe4/0x140
    [<00000000883bbda9>] device_initial_probe+0x18/0x20
    [<000000003be59ef6>] bus_probe_device+0x98/0xa0
    [<0000000039b03d3f>] deferred_probe_work_func+0x74/0xa8
    [<00000000870934ce>] process_one_work+0x1c8/0x470
    [<00000000e3cce570>] worker_thread+0x1f8/0x428
    [<000000005d64975e>] kthread+0xfc/0x128
    [<00000000f0eaa764>] ret_from_fork+0x10/0x18

Fixes: 495c38d300 ("irqdomain: Add irq_domain_{push,pop}_irq() functions")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200120043547.22271-1-haokexin@gmail.com
2020-01-20 19:10:05 +00:00
Yash Shah
b01ecceaf2 genirq: Introduce irq_domain_translate_onecell
Add a new function irq_domain_translate_onecell() that is to be used as
the translate function in struct irq_domain_ops.

Signed-off-by: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1575976274-13487-2-git-send-email-yash.shah@sifive.com
2020-01-20 09:19:33 +00:00
Jules Irenge
099368bb10 genirq: Add missing __must_hold() sparse annotation
Add __must_hold() annotation to address the following sparse warning:

  warning: context imbalance in irq_wait_for_poll - unexpected unlock

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191216144208.29852-2-jbi.octave@gmail.com
2020-01-09 18:03:37 +01:00
Jules Irenge
8b3b54799b genirq: Add missing __releases() sparse annotation
Add __releases() annotation to address the following sparse warning:

  warning: context imbalance in __irq_put_desc_unlock() - unexpected unlock

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191216144208.29852-1-jbi.octave@gmail.com
2020-01-09 18:03:24 +01:00
Luca Ceresoli
025af39b87 genirq: Show irq name in non-oneshot error message
Requesting a threaded IRQ with handler=NULL and !ONESHOT fails, but the
error message does not include the IRQ line name, which makes it harder to
find the offending driver.

Print the IRQ line name to clarify where the error comes from. Use the same
format as the other pr_err() above in the same function.

Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191105140854.27893-1-luca@lucaceresoli.net
2020-01-09 15:42:54 +01:00
Thomas Gleixner
407e62f52a irqchip updates for Linux 5.5
- Qualcomm PDC wakeup interrupt support
 - Layerscape external IRQ support
 - Broadcom bcm7038 PM and wakeup support
 - Ingenic driver cleanup and modernization
 - GICv3 ITS preparation for GICv4.1 updates
 - GICv4 fixes
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl3VJasPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDMLEP/2U55GLmPuNqW/na4YFmcOYAkoP0WpEKDzn9
 u8lBi8CukKl1Z2JLAyP0E1e5iOVS0exvQ5V4OwxGKmeR9oWmB3Lym8UWRw7vcKEF
 HMKgtPCd3U3J5jW0P4Hr8hn6Q+B55fdMrrAaJcgsfBVB7bRB0lC0LZYGtN4VC4d8
 rTQzup5CK8Mu9k4NztLCxxoBUKFoqM+ZKsrRB2eOXB9amcPQtFvwC+5ZL2tDr3wS
 7d+pd6G4A+hsloIDUxoH9BrO/jd1jlfHyBRDFJIgpo/IQWVT6ciQECZomRR1pW30
 bGFYBf/HPFqfyH+ZOrWprSAd0Yx33WtYaMokYaJ6vGu4wedyxh/1LTRLzL0tWuyZ
 tPFvEmiiP/Hoeq1JHFRFQUO/75ckqALLeAxCjACCN8+F2Z0armk1W/iwehZNQHHV
 JdDXegRNUlMipG2kk5D3L6AK28bi+3+axc1ERMN1RO40eLm8NLogWL2TJlxLbyUe
 lMZMe43ceC0McGnQpAY8qyuC7IycQtngKNBvzG+6ADucGpFez3gYxh39RR43XMVo
 37Hsj+Ur7CFBJj6WTCzV2teC/WaXXQkJYxn6fsHNmUgdwPgGD3LppxhlWG49Ao9w
 x8ZnfyrrYmcFOJrKbT45ExMihioaGf8dyksKZNA/Z4dI0g/kf0LyYi5ujZaDDilI
 eDkMI/xI
 =uENR
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - Qualcomm PDC wakeup interrupt support
 - Layerscape external IRQ support
 - Broadcom bcm7038 PM and wakeup support
 - Ingenic driver cleanup and modernization
 - GICv3 ITS preparation for GICv4.1 updates
 - GICv4 fixes
 - Various cleanups
2019-11-20 14:16:34 +01:00
Maulik Shah
4a169a95d8 genirq: Introduce irq_chip_get/set_parent_state calls
On certain QTI chipsets some GPIOs are direct-connect interrupts to the
GIC to be used as regular interrupt lines. When the GPIOs are not used
for interrupt generation the interrupt line is disabled. But disabling
the interrupt at GIC does not prevent the interrupt to be reported as
pending at GIC_ISPEND. Later, when drivers call enable_irq() on the
interrupt, an unwanted interrupt occurs.

Introduce get and set methods for irqchip's parent to clear it's pending
irq state. This then can be invoked by the GPIO interrupt controller on
the parents in it hierarchy to clear the interrupt before enabling the
interrupt.

Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1573855915-9841-7-git-send-email-ilina@codeaurora.org

[updated commit text and minor code fixes]
2019-11-16 10:20:02 +00:00
luanshi
20a15ee040 genirq: Fix function documentation of __irq_alloc_descs()
The function got renamed at some point, but the kernel-doc was not updated.

Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1573656093-8643-1-git-send-email-zhangliguang@linux.alibaba.com
2019-11-15 10:48:38 +01:00
Yi Wang
0ed9ca2589 irq/irqdomain: Update __irq_domain_alloc_fwnode() function documentation
A recent commit changed a parameter of __irq_domain_alloc_fwnode(), but
did not update the documentation comment. Fix it up.

Fixes: b977fcf477 ("irqdomain/debugfs: Use PAs to generate fwnode names")
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1571476047-29463-1-git-send-email-wang.yi59@zte.com.cn
2019-11-05 00:48:26 +01:00
Linus Torvalds
77dcfe2b9e Power management updates for 5.4-rc1
- Rework the main suspend-to-idle control flow to avoid repeating
    "noirq" device resume and suspend operations in case of spurious
    wakeups from the ACPI EC and decouple the ACPI EC wakeups support
    from the LPS0 _DSM support (Rafael Wysocki).
 
  - Extend the wakeup sources framework to expose wakeup sources as
    device objects in sysfs (Tri Vo, Stephen Boyd).
 
  - Expose system suspend statistics in sysfs (Kalesh Singh).
 
  - Introduce a new haltpoll cpuidle driver and a new matching
    governor for virtualized guests wanting to do guest-side polling
    in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li,
    Stephen Rothwell).
 
  - Fix the menu and teo cpuidle governors to allow the scheduler tick
    to be stopped if PM QoS is used to limit the CPU idle state exit
    latency in some cases (Rafael Wysocki).
 
  - Increase the resolution of the play_idle() argument to microseconds
    for more fine-grained injection of CPU idle cycles (Daniel Lezcano).
 
  - Switch over some users of cpuidle notifiers to the new QoS-based
    frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY
    policy notifier events (Viresh Kumar).
 
  - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li).
 
  - Add support for MT8183 and MT8516 to the mediatek cpufreq driver
    (Andrew-sh.Cheng, Fabien Parent).
 
  - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson
    Huang).
 
  - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz).
 
  - Update the qcom cpufreq driver (among other things, to make it
    easier to extend and to use kryo cpufreq for other nvmem-based
    SoCs) and add qcs404 support to it  (Niklas Cassel, Douglas
    RAILLARD, Sibi Sankar, Sricharan R).
 
  - Fix assorted issues and make assorted minor improvements in the
    cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli,
    Gustavo Silva, Hariprasad Kelam).
 
  - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd
    Bergmann).
 
  - Add new Exynos PPMU events to devfreq events and extend that
    mechanism (Lukasz Luba).
 
  - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny).
 
  - Improve devfreq documentation and governor code, fix spelling
    typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard
    Crestez, MyungJoo Ham, Gaël PORTAY).
 
  - Add regulators enable and disable to the OPP (operating performance
    points) framework (Kamil Konieczny).
 
  - Update the OPP framework to support multiple opp-suspend properties
    (Anson Huang).
 
  - Fix assorted issues and make assorted minor improvements in the OPP
    code (Niklas Cassel, Viresh Kumar, Yue Hu).
 
  - Clean up the generic power domains (genpd) framework (Ulf Hansson).
 
  - Clean up assorted pieces of power management code and documentation
    (Akinobu Mita, Amit Kucheria, Chuhong Yuan).
 
  - Update the pm-graph tool to version 5.5 including multiple fixes
    and improvements (Todd Brandt).
 
  - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven,
    Sébastien Szymanski).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl2ArZ4SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxgfYQAK80hs43vWQDmp7XKrN4pQe8+qYULAGO
 fBfrFl+NG9y/cnuqnt3NtA8MoyNsMMkMLkpkEDMfSbYqqH5ehEzX5+uGJWiWx8+Y
 oH5KU8MH7Tj/utYaalGzDt0AHfHZDIGC0NCUNQJVtE/4mOANFabwsCwscp4MrD5Q
 WjFN8U4BrsmWgJdZ/U9QIWcDZ0I+1etCF+rZG2yxSv31FMq2Zk/Qm4YyobqCvQFl
 TR9rxl08wqUmIYIz5cDjt/3AKH7NLLDqOTstbCL7cmufM5XPFc1yox69xc89UrIa
 4AMgmDp7SMwFG/gdUPof0WQNmx7qxmiRAPleAOYBOZW/8jPNZk2y+RhM5NeF72m7
 AFqYiuxqatkSb4IsT8fLzH9IUZOdYr8uSmoMQECw+MHdApaKFjFV8Lb/qx5+AwkD
 y7pwys8dZSamAjAf62eUzJDWcEwkNrujIisGrIXrVHb7ISbweskMOmdAYn9p4KgP
 dfRzpJBJ45IaMIdbaVXNpg3rP7Apfs7X1X+/ZhG6f+zHH3zYwr8Y81WPqX8WaZJ4
 qoVCyxiVWzMYjY2/1lzjaAdqWojPWHQ3or3eBaK52DouyG3jY6hCDTLwU7iuqcCX
 jzAtrnqrNIKufvaObEmqcmYlIIOFT7QaJCtGUSRFQLfSon8fsVSR7LLeXoAMUJKT
 JWQenuNaJngK
 =TBDQ
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These include a rework of the main suspend-to-idle code flow (related
  to the handling of spurious wakeups), a switch over of several users
  of cpufreq notifiers to QoS-based limits, a new devfreq driver for
  Tegra20, a new cpuidle driver and governor for virtualized guests, an
  extension of the wakeup sources framework to expose wakeup sources as
  device objects in sysfs, and more.

  Specifics:

   - Rework the main suspend-to-idle control flow to avoid repeating
     "noirq" device resume and suspend operations in case of spurious
     wakeups from the ACPI EC and decouple the ACPI EC wakeups support
     from the LPS0 _DSM support (Rafael Wysocki).

   - Extend the wakeup sources framework to expose wakeup sources as
     device objects in sysfs (Tri Vo, Stephen Boyd).

   - Expose system suspend statistics in sysfs (Kalesh Singh).

   - Introduce a new haltpoll cpuidle driver and a new matching governor
     for virtualized guests wanting to do guest-side polling in the idle
     loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell).

   - Fix the menu and teo cpuidle governors to allow the scheduler tick
     to be stopped if PM QoS is used to limit the CPU idle state exit
     latency in some cases (Rafael Wysocki).

   - Increase the resolution of the play_idle() argument to microseconds
     for more fine-grained injection of CPU idle cycles (Daniel
     Lezcano).

   - Switch over some users of cpuidle notifiers to the new QoS-based
     frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY
     policy notifier events (Viresh Kumar).

   - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li).

   - Add support for MT8183 and MT8516 to the mediatek cpufreq driver
     (Andrew-sh.Cheng, Fabien Parent).

   - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson
     Huang).

   - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz).

   - Update the qcom cpufreq driver (among other things, to make it
     easier to extend and to use kryo cpufreq for other nvmem-based
     SoCs) and add qcs404 support to it (Niklas Cassel, Douglas
     RAILLARD, Sibi Sankar, Sricharan R).

   - Fix assorted issues and make assorted minor improvements in the
     cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli,
     Gustavo Silva, Hariprasad Kelam).

   - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd
     Bergmann).

   - Add new Exynos PPMU events to devfreq events and extend that
     mechanism (Lukasz Luba).

   - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny).

   - Improve devfreq documentation and governor code, fix spelling typos
     in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez,
     MyungJoo Ham, Gaël PORTAY).

   - Add regulators enable and disable to the OPP (operating performance
     points) framework (Kamil Konieczny).

   - Update the OPP framework to support multiple opp-suspend properties
     (Anson Huang).

   - Fix assorted issues and make assorted minor improvements in the OPP
     code (Niklas Cassel, Viresh Kumar, Yue Hu).

   - Clean up the generic power domains (genpd) framework (Ulf Hansson).

   - Clean up assorted pieces of power management code and documentation
     (Akinobu Mita, Amit Kucheria, Chuhong Yuan).

   - Update the pm-graph tool to version 5.5 including multiple fixes
     and improvements (Todd Brandt).

   - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven,
     Sébastien Szymanski)"

* tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (126 commits)
  cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available
  cpuidle-haltpoll: do not set an owner to allow modunload
  cpuidle-haltpoll: return -ENODEV on modinit failure
  cpuidle-haltpoll: set haltpoll as preferred governor
  cpuidle: allow governor switch on cpuidle_register_driver()
  PM: runtime: Documentation: add runtime_status ABI document
  pm-graph: make setVal unbuffered again for python2 and python3
  powercap: idle_inject: Use higher resolution for idle injection
  cpuidle: play_idle: Increase the resolution to usec
  cpuidle-haltpoll: vcpu hotplug support
  cpufreq: Add qcs404 to cpufreq-dt-platdev blacklist
  cpufreq: qcom: Add support for qcs404 on nvmem driver
  cpufreq: qcom: Refactor the driver to make it easier to extend
  cpufreq: qcom: Re-organise kryo cpufreq to use it for other nvmem based qcom socs
  dt-bindings: opp: Add qcom-opp bindings with properties needed for CPR
  dt-bindings: opp: qcom-nvmem: Support pstates provided by a power domain
  Documentation: cpufreq: Update policy notifier documentation
  cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events
  PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state()
  PM / Domains: Simplify genpd_lookup_dev()
  ...
2019-09-17 19:15:14 -07:00
Linus Torvalds
a572ba6329 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq updates from Thomas Gleixner:
 "Updates from the irq departement:

   - Update the interrupt spreading code so it handles numa node with
     different CPU counts properly.

   - A large overhaul of the ARM GiCv3 driver to support new PPI and SPI
     ranges.

   - Conversion of all alloc_fwnode() users to use physical addresses
     instead of virtual addresses so the virtual addresses are not
     leaked. The physical address is sufficient to identify the
     associated interrupt chip.

   - Add support for Marvel MMP3, Amlogic Meson SM1 interrupt chips.

   - Enforce interrupt threading at compile time if RT is enabled.

   - Small updates and improvements all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  irqchip/gic-v3-its: Fix LPI release for Multi-MSI devices
  irqchip/uniphier-aidet: Use devm_platform_ioremap_resource()
  irqdomain: Add the missing assignment of domain->fwnode for named fwnode
  irqchip/mmp: Coexist with GIC root IRQ controller
  irqchip/mmp: Mask off interrupts from other cores
  irqchip/mmp: Add missing chained_irq_{enter,exit}()
  irqchip/mmp: Do not use of_address_to_resource() to get mux regs
  irqchip/meson-gpio: Add support for meson sm1 SoCs
  dt-bindings: interrupt-controller: New binding for the meson sm1 SoCs
  genirq/affinity: Remove const qualifier from node_to_cpumask argument
  genirq/affinity: Spread vectors on node according to nr_cpu ratio
  genirq/affinity: Improve __irq_build_affinity_masks()
  irqchip: Remove dev_err() usage after platform_get_irq()
  irqchip: Add include guard to irq-partition-percpu.h
  irqchip/mmp: Do not call irq_set_default_host() on DT platforms
  irqchip/gic-v3-its: Remove the redundant set_bit for lpi_map
  irqchip/gic-v3: Add quirks for HIP06/07 invalid GICD_TYPER erratum 161010803
  irqchip/gic: Skip DT quirks when evaluating IIDR-based quirks
  irqchip/gic-v3: Warn about inconsistent implementations of extended ranges
  irqchip/gic-v3: Add EPPI range support
  ...
2019-09-17 11:42:15 -07:00
Rafael J. Wysocki
d281706369 Merge branch 'pm-sleep'
* pm-sleep: (29 commits)
  ACPI: PM: s2idle: Always set up EC GPE for system wakeup
  ACPI: PM: s2idle: Avoid rearming SCI for wakeup unnecessarily
  PM / wakeup: Unexport wakeup_source_sysfs_{add,remove}()
  PM / wakeup: Register wakeup class kobj after device is added
  PM / wakeup: Fix sysfs registration error path
  PM / wakeup: Show wakeup sources stats in sysfs
  PM / wakeup: Use wakeup_source_register() in wakelock.c
  PM / wakeup: Drop wakeup_source_init(), wakeup_source_prepare()
  PM: sleep: Replace strncmp() with str_has_prefix()
  PM: suspend: Fix platform_suspend_prepare_noirq()
  intel-hid: Disable button array during suspend-to-idle
  intel-hid: intel-vbtn: Avoid leaking wakeup_mode set
  ACPI: PM: s2idle: Execute LPS0 _DSM functions with suspended devices
  ACPI: EC: PM: Make acpi_ec_dispatch_gpe() print debug message
  ACPI: EC: PM: Consolidate some code depending on PM_SLEEP
  ACPI: PM: s2idle: Eliminate acpi_sleep_no_ec_events()
  ACPI: PM: s2idle: Switch EC over to polling during "noirq" suspend
  ACPI: PM: s2idle: Add acpi.sleep_no_lps0 module parameter
  ACPI: PM: s2idle: Rearrange lps0_device_attach()
  PM/sleep: Expose suspend stats in sysfs
  ...
2019-09-17 09:36:34 +02:00
Linus Torvalds
7e67a85999 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - MAINTAINERS: Add Mark Rutland as perf submaintainer, Juri Lelli and
   Vincent Guittot as scheduler submaintainers. Add Dietmar Eggemann,
   Steven Rostedt, Ben Segall and Mel Gorman as scheduler reviewers.

   As perf and the scheduler is getting bigger and more complex,
   document the status quo of current responsibilities and interests,
   and spread the review pain^H^H^H^H fun via an increase in the Cc:
   linecount generated by scripts/get_maintainer.pl. :-)

 - Add another series of patches that brings the -rt (PREEMPT_RT) tree
   closer to mainline: split the monolithic CONFIG_PREEMPT dependencies
   into a new CONFIG_PREEMPTION category that will allow the eventual
   introduction of CONFIG_PREEMPT_RT. Still a few more hundred patches
   to go though.

 - Extend the CPU cgroup controller with uclamp.min and uclamp.max to
   allow the finer shaping of CPU bandwidth usage.

 - Micro-optimize energy-aware wake-ups from O(CPUS^2) to O(CPUS).

 - Improve the behavior of high CPU count, high thread count
   applications running under cpu.cfs_quota_us constraints.

 - Improve balancing with SCHED_IDLE (SCHED_BATCH) tasks present.

 - Improve CPU isolation housekeeping CPU allocation NUMA locality.

 - Fix deadline scheduler bandwidth calculations and logic when cpusets
   rebuilds the topology, or when it gets deadline-throttled while it's
   being offlined.

 - Convert the cpuset_mutex to percpu_rwsem, to allow it to be used from
   setscheduler() system calls without creating global serialization.
   Add new synchronization between cpuset topology-changing events and
   the deadline acceptance tests in setscheduler(), which were broken
   before.

 - Rework the active_mm state machine to be less confusing and more
   optimal.

 - Rework (simplify) the pick_next_task() slowpath.

 - Improve load-balancing on AMD EPYC systems.

 - ... and misc cleanups, smaller fixes and improvements - please see
   the Git log for more details.

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  sched/psi: Correct overly pessimistic size calculation
  sched/fair: Speed-up energy-aware wake-ups
  sched/uclamp: Always use 'enum uclamp_id' for clamp_id values
  sched/uclamp: Update CPU's refcount on TG's clamp changes
  sched/uclamp: Use TG's clamps to restrict TASK's clamps
  sched/uclamp: Propagate system defaults to the root group
  sched/uclamp: Propagate parent clamps
  sched/uclamp: Extend CPU's cgroup controller
  sched/topology: Improve load balancing on AMD EPYC systems
  arch, ia64: Make NUMA select SMP
  sched, perf: MAINTAINERS update, add submaintainers and reviewers
  sched/fair: Use rq_lock/unlock in online_fair_sched_group
  cpufreq: schedutil: fix equation in comment
  sched: Rework pick_next_task() slow-path
  sched: Allow put_prev_task() to drop rq->lock
  sched/fair: Expose newidle_balance()
  sched: Add task_struct pointer to sched_class::set_curr_task
  sched: Rework CPU hotplug task selection
  sched/{rt,deadline}: Fix set_next_task vs pick_next_task
  sched: Fix kerneldoc comment for ia64_set_curr_task
  ...
2019-09-16 17:25:49 -07:00
Linus Torvalds
76f0f227cf ia64 for v5.4 - big change here is removal of support for SGI Altix
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdf64MAAoJEKurIx+X31iBB20P/07o93sBT92SiA2/ety9sLqV
 BGJmEdw7gyb9WVbUip6s71FIEKZw4foCGkqDiX+lr5Fw2A9tiK7LmFgTLi4LLwg+
 syhYZ1y5/mwBI4FLlJudKjQdFZjr/n7DNlz4H67woE2kK+FyRsOKEaFUhuR8+0rC
 mKJBKtIGnoIOPG06PT1k5qfdpzlreCFoWdIhjO55LfDgZnnDiMaX5h0vcBQ9xgCp
 xGV0n/f7+qn4pzB4hGvNV209Sdgv2V4t77bHNvyXlJrM5Hqzafo5MzFgEJv+fRqJ
 2RnkWVhwctfbid/2ggf2aAsYnMK3GigEaOCsYW2oWJESVUQhxIi3ndF/Jt9fraZv
 ZouD7G/s64P5lUQuCT9JnKGzJrSgxvkd37049AZ4pFVc2MzLC6o6dyyP8pu5ARe8
 T0shFik3+gsml2US/vSUzxvrg1saRQjl9E/AJ0RTZ8oyP4FNnFmkJf38qj3a0L0k
 ILFYscM5q7WPggoDA/m6F96tLGhdK/sKjDzrADjEh2dIvn4woqoEJSDn+rXuP+Gm
 UOj1v8mILZCqvOAmc9IkGCkPUlbrmNV/1FYh5+GWudtillEaD82vjSqm+jnVbfXD
 REvHlR/kxCSj1gg/+nk+NFdZCkW3xETOcTZohhDkR7du2mHjTwBMZ2YRPrqoX4c8
 VZA57Mrqm5Uk5601qYRl
 =L5e+
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-ia64_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 updates from Tony Luck:
 "The big change here is removal of support for SGI Altix"

* tag 'please-pull-ia64_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: (33 commits)
  genirq: remove the is_affinity_mask_valid hook
  ia64: remove CONFIG_SWIOTLB ifdefs
  ia64: remove support for machvecs
  ia64: move the screen_info setup to common code
  ia64: move the ROOT_DEV setup to common code
  ia64: rework iommu probing
  ia64: remove the unused sn_coherency_id symbol
  ia64: remove the SGI UV simulator support
  ia64: remove the zx1 swiotlb machvec
  ia64: remove CONFIG_ACPI ifdefs
  ia64: remove CONFIG_PCI ifdefs
  ia64: remove the hpsim platform
  ia64: remove now unused machvec indirections
  ia64: remove support for the SGI SN2 platform
  drivers: remove the SGI SN2 IOC4 base support
  drivers: remove the SGI SN2 IOC3 base support
  qla2xxx: remove SGI SN2 support
  qla1280: remove SGI SN2 support
  misc/sgi-xp: remove SGI SN2 support
  char/mspec: remove SGI SN2 support
  ...
2019-09-16 15:32:01 -07:00
Thomas Gleixner
9cc5b7fba5 irqchip updates for Linux 5.4
- Large GICv3 updates to support new PPI and SPI ranges
 - Conver all alloc_fwnode() users to use PAs instead of VAs
 - Add support for Marvell's MMP3 irqchip
 - Add support for Amlogic Meson SM1
 - Various cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl1yLdwPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDVHgP+wTAtNzxF4nRsJO3ZlbKe7n2ut7toZvqip7U
 97Ll652m/24stioU7b4/Z92E1fWIQe6Bv7u5detNQ4bIg2uwD6ZDYMjLLOXsJXrQ
 XH7pEAVe2eO0rjZudodUteyPGZQAS1/okY/4qsYNDiWKHi5xQPwMBNJ6fVwOVXUk
 HG7ClMny1i7SSRm3CGmW3Hxb279DrHOxCGPO6aEX5UQ/uZErCTJN5oFdLfAKvfaU
 qRpzetq4QOO+L7UBer6+jmtAvA/kuYwpGSnhJXhFOLMSgrGLQbINvYeAyv2vKGvF
 l6HQNuEf+OQjR5RcUZi6qbAO/WEh2xzmxpqta8COwjBFDcwsITyc5wNRNEfmNqO9
 Yrh3/a/J0eWX6X4rmmJkMDEBQvNbyA1//9d5NIjPhIVERZYPa4Z2lb220P4+58r7
 xnCqZnxZO78BhUd7HXeCluuTJs83iHxs03nAHysU9YZWkiF7jDXE0d3kla8cGcdY
 aOaSID0Q52flRxLymWJs29kyXZU8rpiPTbxvi86Ggkvgrj2pSwb8PF8LL4WPkzgh
 BcuF2ca9oa22OlA3oLV/sKwFRS77en+bpzLeU0NjeAcu5b8AY2nSSYXiuvtxIzi3
 FbQCsBBl7i5F5AQT9XGugjeNsCRUsUvIOP4f4w2Ej6HmqJb/SJVMBedMBsjGoEFG
 sj93Nc7f
 =PsGr
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates for Linux 5.4 from Marc Zyngier:

 - Large GICv3 updates to support new PPI and SPI ranges
 - Conver all alloc_fwnode() users to use PAs instead of VAs
 - Add support for Marvell's MMP3 irqchip
 - Add support for Amlogic Meson SM1
 - Various cleanups and fixes
2019-09-06 13:03:39 +02:00
Yunfeng Ye
eddf3e9c7c genirq: Prevent NULL pointer dereference in resend_irqs()
The following crash was observed:

  Unable to handle kernel NULL pointer dereference at 0000000000000158
  Internal error: Oops: 96000004 [#1] SMP
  pc : resend_irqs+0x68/0xb0
  lr : resend_irqs+0x64/0xb0
  ...
  Call trace:
   resend_irqs+0x68/0xb0
   tasklet_action_common.isra.6+0x84/0x138
   tasklet_action+0x2c/0x38
   __do_softirq+0x120/0x324
   run_ksoftirqd+0x44/0x60
   smpboot_thread_fn+0x1ac/0x1e8
   kthread+0x134/0x138
   ret_from_fork+0x10/0x18

The reason for this is that the interrupt resend mechanism happens in soft
interrupt context, which is a asynchronous mechanism versus other
operations on interrupts. free_irq() does not take resend handling into
account. Thus, the irq descriptor might be already freed before the resend
tasklet is executed. resend_irqs() does not check the return value of the
interrupt descriptor lookup and derefences the return value
unconditionally.

  1):
  __setup_irq
    irq_startup
      check_irq_resend  // activate softirq to handle resend irq
  2):
  irq_domain_free_irqs
    irq_free_descs
      free_desc
        call_rcu(&desc->rcu, delayed_free_desc)
  3):
  __do_softirq
    tasklet_action
      resend_irqs
        desc = irq_to_desc(irq)
        desc->handle_irq(desc)  // desc is NULL --> Ooops

Fix this by adding a NULL pointer check in resend_irqs() before derefencing
the irq descriptor.

Fixes: a4633adcdb ("[PATCH] genirq: add genirq sw IRQ-retrigger")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1630ae13-5c8e-901e-de09-e740b6a426a7@huawei.com
2019-09-05 21:31:14 +02:00
Dexuan Cui
711419e504 irqdomain: Add the missing assignment of domain->fwnode for named fwnode
Recently device pass-through stops working for Linux VM running on Hyper-V.

git-bisect shows the regression is caused by the recent commit
467a3bb974 ("PCI: hv: Allocate a named fwnode ..."), but the root cause
is that the commit d59f6617ee forgets to set the domain->fwnode for
IRQCHIP_FWNODE_NAMED*, and as a result:

1. The domain->fwnode remains to be NULL.

2. irq_find_matching_fwspec() returns NULL since "h->fwnode == fwnode" is
false, and pci_set_bus_msi_domain() sets the Hyper-V PCI root bus's
msi_domain to NULL.

3. When the device is added onto the root bus, the device's dev->msi_domain
is set to NULL in pci_set_msi_domain().

4. When a device driver tries to enable MSI-X, pci_msi_setup_msi_irqs()
calls arch_setup_msi_irqs(), which uses the native MSI chip (i.e.
arch/x86/kernel/apic/msi.c: pci_msi_controller) to set up the irqs, but
actually pci_msi_setup_msi_irqs() is supposed to call
msi_domain_alloc_irqs() with the hbus->irq_domain, which is created in
hv_pcie_init_irq_domain() and is associated with the Hyper-V chip
hv_msi_irq_chip. Consequently, the irq line is not properly set up, and
the device driver can not receive any interrupt.

Fixes: d59f6617ee ("genirq: Allow fwnode to carry name information only")
Fixes: 467a3bb974 ("PCI: hv: Allocate a named fwnode instead of an address-based one")
Reported-by: Lili Deng <v-lide@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/PU1P153MB01694D9AF625AC335C600C5FBFBE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
2019-09-03 09:16:50 +01:00
Ming Lei
101f85b56d genirq/affinity: Remove const qualifier from node_to_cpumask argument
When CONFIG_CPUMASK_OFFSTACK isn't enabled, 'cpumask_var_t' is as

'typedef struct cpumask cpumask_var_t[1]',

so the argument 'node_to_cpumask' alloc_nodes_vectors() can't be declared
as 'const cpumask_var_t *'

Fixes the following warning:

   kernel/irq/affinity.c: In function '__irq_build_affinity_masks':
     alloc_nodes_vectors(numvecs, node_to_cpumask, cpu_mask,
                                  ^
   kernel/irq/affinity.c:128:13: note: expected 'const struct cpumask (*)[1]' but argument is of type 'struct cpumask (*)[1]'
    static void alloc_nodes_vectors(unsigned int numvecs,
                ^
Fixes: b1a5a73e64 ("genirq/affinity: Spread vectors on node according to nr_cpu ratio")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190828085815.19931-1-ming.lei@redhat.com
2019-08-28 12:20:43 +02:00
Ming Lei
b1a5a73e64 genirq/affinity: Spread vectors on node according to nr_cpu ratio
Now __irq_build_affinity_masks() spreads vectors evenly per node, but there
is a case that not all vectors have been spread when each numa node has a
different number of CPUs which triggers the warning in the spreading code.

Improve the spreading algorithm by

 - assigning vectors according to the ratio of the number of CPUs on a node
   to the number of remaining CPUs.

 - running the assignment from smaller nodes to bigger nodes to guarantee
   that every active node gets allocated at least one vector.

This ensures that all vectors are spread out. Asided of that the spread
becomes more fair if the nodes have different number of CPUs.

For example, on the following machine:
	CPU(s):              16
	On-line CPU(s) list: 0-15
	Thread(s) per core:  1
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	...
	NUMA node0 CPU(s):   0,1,3,5-9,11,13-15
	NUMA node1 CPU(s):   2,4,10,12

When a driver requests to allocate 8 vectors, the following spread results:

	irq 31, cpu list 2,4
	irq 32, cpu list 10,12
	irq 33, cpu list 0-1
	irq 34, cpu list 3,5
	irq 35, cpu list 6-7
	irq 36, cpu list 8-9
	irq 37, cpu list 11,13
	irq 38, cpu list 14-15

So Node 0 has now 6 and Node 1 has 2 vectors assigned. The original
algorithm assigned 4 vectors on each node which was unfair versus Node 0.

[ tglx: Massaged changelog ]

Reported-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Link: https://lkml.kernel.org/r/20190816022849.14075-3-ming.lei@redhat.com
2019-08-27 16:31:17 +02:00
Ming Lei
53c1788b7d genirq/affinity: Improve __irq_build_affinity_masks()
One invariant of __irq_build_affinity_masks() is that all CPUs in the
specified masks (cpu_mask AND node_to_cpumask for each node) should be
covered during the spread. Even though all requested vectors have been
reached, it's still required to spread vectors among remained CPUs. A
similar policy has been taken in case of 'numvecs <= nodes' already.

So remove the following check inside the loop:

	if (done >= numvecs)
		break;

Meantime assign at least 1 vector for remaining nodes if 'numvecs' vectors
have been handled already.

Also, if the specified cpumask for one numa node is empty, simply do not
spread vectors on this node.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190816022849.14075-2-ming.lei@redhat.com
2019-08-27 16:31:17 +02:00
Michael Kelley
d0ff14fdc9 genirq: Properly pair kobject_del() with kobject_add()
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().

Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.

This problem surfaced because commit aa30f47cf6 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.

[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]

Fixes: ecb3f394c5 ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
2019-08-19 21:41:19 +02:00
Thomas Gleixner
b6a32bbd87 genirq: Force interrupt threading on RT
Switch force_irqthreads from a boot time modifiable variable to a compile
time constant when CONFIG_PREEMPT_RT is enabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190816160923.12855-1-bigeasy@linutronix.de
2019-08-19 15:45:48 +02:00
Christoph Hellwig
0d3d343560 genirq: remove the is_affinity_mask_valid hook
This override was only used by the ia64 SGI SN2 platform, which is
gone now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lkml.kernel.org/r/20190813072514.23299-29-hch@lst.de
Signed-off-by: Tony Luck <tony.luck@intel.com>
2019-08-16 14:32:26 -07:00
Ming Lei
491beed3b1 genirq/affinity: Create affinity mask for single vector
Since commit c66d4bd110 ("genirq/affinity: Add new callback for
(re)calculating interrupt sets"), irq_create_affinity_masks() returns
NULL in case of single vector. This change has caused regression on some
drivers, such as lpfc.

The problem is that single vector requests can happen in some generic cases:

  1) kdump kernel

  2) irq vectors resource is close to exhaustion.

If in that situation the affinity mask for a single vector is not created,
every caller has to handle the special case.

There is no reason why the mask cannot be created, so remove the check for
a single vector and create the mask.

Fixes: c66d4bd110 ("genirq/affinity: Add new callback for (re)calculating interrupt sets")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190805011906.5020-1-ming.lei@redhat.com
2019-08-08 08:47:55 +02:00
Marc Zyngier
b977fcf477 irqdomain/debugfs: Use PAs to generate fwnode names
Booting a large arm64 server (HiSi D05) leads to the following
shouting at boot time:

[   20.722132] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.730851] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.739560] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.748267] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.756975] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.765683] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!
[   20.774391] debugfs: File 'irqchip@(____ptrval____)-3' in directory 'domains' already present!

and many more... Evidently, we expect something a bit more informative
than ____ptrval____, and certainly we want all of our domains, not just
the first one.

For that, turn the %p used to generate the fwnode name into something
that won't be repainted (%pa). Given that we've now fixed all users to
pass a pointer to a PA, it will actually do the right thing.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-08-07 14:24:54 +01:00
Matthew Wilcox (Oracle)
7b3c92b85a sched/core: Convert get_task_struct() to return the task
Returning the pointer that was passed in allows us to write
slightly more idiomatic code.  Convert a few users.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190704221323.24290-1-willy@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-25 15:51:54 +02:00
Rafael J. Wysocki
3a79bc63d9 PCI: irq: Introduce rearm_wake_irq()
Introduce a new function, rearm_wake_irq(), allowing a wakeup IRQ
to be armed for systen wakeup detection again without running any
action handlers associated with it after it has been armed for
wakeup detection and triggered.

That is useful for IRQs, like ACPI SCI, that may deliver wakeup
as well as non-wakeup interrupts when armed for systen wakeup
detection.  In those cases, it may be possible to determine whether
or not the delivered interrupt is a systen wakeup one without
running the entire action handler (or handlers, if the IRQ is
shared) for the IRQ, and if the interrupt turns out to be a
non-wakeup one, the IRQ can be rearmed with the help of the
new function.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-23 09:37:51 +02:00
Linus Torvalds
0902d5011c Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x96 apic updates from Thomas Gleixner:
 "Updates for the x86 APIC interrupt handling and APIC timer:

   - Fix a long standing issue with spurious interrupts which was caused
     by the big vector management rework a few years ago. Robert Hodaszi
     provided finally enough debug data and an excellent initial failure
     analysis which allowed to understand the underlying issues.

     This contains a change to the core interrupt management code which
     is required to handle this correctly for the APIC/IO_APIC. The core
     changes are NOOPs for most architectures except ARM64. ARM64 is not
     impacted by the change as confirmed by Marc Zyngier.

   - Newer systems allow to disable the PIT clock for power saving
     causing panic in the timer interrupt delivery check of the IO/APIC
     when the HPET timer is not enabled either. While the clock could be
     turned on this would cause an endless whack a mole game to chase
     the proper register in each affected chipset.

     These systems provide the relevant frequencies for TSC, CPU and the
     local APIC timer via CPUID and/or MSRs, which allows to avoid the
     PIT/HPET based calibration. As the calibration code is the only
     usage of the legacy timers on modern systems and is skipped anyway
     when the frequencies are known already, there is no point in
     setting up the PIT and actually checking for the interrupt delivery
     via IO/APIC.

     To achieve this on a wide variety of platforms, the CPUID/MSR based
     frequency readout has been made more robust, which also allowed to
     remove quite some workarounds which turned out to be not longer
     required. Thanks to Daniel Drake for analysis, patches and
     verification"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Seperate unused system vectors from spurious entry again
  x86/irq: Handle spurious interrupt after shutdown gracefully
  x86/ioapic: Implement irq_get_irqchip_state() callback
  genirq: Add optional hardware synchronization for shutdown
  genirq: Fix misleading synchronize_irq() documentation
  genirq: Delay deactivation in free_irq()
  x86/timer: Skip PIT initialization on modern chipsets
  x86/apic: Use non-atomic operations when possible
  x86/apic: Make apic_bsp_setup() static
  x86/tsc: Set LAPIC timer period to crystal clock frequency
  x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period'
  x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency
2019-07-08 11:22:57 -07:00
Linus Torvalds
2a1ccd3142 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq departement provides the usual mixed bag:

  Core:

   - Further improvements to the irq timings code which aims to predict
     the next interrupt for power state selection to achieve better
     latency/power balance

   - Add interrupt statistics to the core NMI handlers

   - The usual small fixes and cleanups

  Drivers:

   - Support for Renesas RZ/A1, Annapurna Labs FIC, Meson-G12A SoC and
     Amazon Gravition AMR/GIC interrupt controllers.

   - Rework of the Renesas INTC controller driver

   - ACPI support for Socionext SoCs

   - Enhancements to the CSKY interrupt controller

   - The usual small fixes and cleanups"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  irq/irqdomain: Fix comment typo
  genirq: Update irq stats from NMI handlers
  irqchip/gic-pm: Remove PM_CLK dependency
  irqchip/al-fic: Introduce Amazon's Annapurna Labs Fabric Interrupt Controller Driver
  dt-bindings: interrupt-controller: Add Amazon's Annapurna Labs FIC
  softirq: Use __this_cpu_write() in takeover_tasklets()
  irqchip/mbigen: Stop printing kernel addresses
  irqchip/gic: Add dependency for ARM_GIC_MAX_NR
  genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()
  genirq/timings: Add selftest for next event computation
  genirq/timings: Add selftest for irqs circular buffer
  genirq/timings: Add selftest for circular array
  genirq/timings: Encapsulate storing function
  genirq/timings: Encapsulate timings push
  genirq/timings: Optimize the period detection speed
  genirq/timings: Fix timings buffer inspection
  genirq/timings: Fix next event index function
  irqchip/qcom: Use struct_size() in devm_kzalloc()
  irqchip/irq-csky-mpintc: Remove unnecessary loop in interrupt handler
  dt-bindings: interrupt-controller: Update csky mpintc
  ...
2019-07-08 11:01:13 -07:00
Zenghui Yu
3a1d24ca95 irq/irqdomain: Fix comment typo
Fix typo in the comment on top of __irq_domain_add().

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562388072-23492-1-git-send-email-yuzenghui@huawei.com
2019-07-06 10:40:20 +02:00
Shijith Thotton
c09cb12935 genirq: Update irq stats from NMI handlers
The NMI handlers handle_percpu_devid_fasteoi_nmi() and handle_fasteoi_nmi()
do not update the interrupt counts. Due to that the NMI interrupt count
does not show up correctly in /proc/interrupts.

Add the statistics and treat the NMI handlers in the same way as per cpu
interrupts and prevent them from updating irq_desc::tot_count as this might
be corrupted due to concurrency.

[ tglx: Massaged changelog ]

Fixes: 2dcf1fbcad ("genirq: Provide NMI handlers")
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562313336-11888-1-git-send-email-sthotton@marvell.com
2019-07-06 10:40:19 +02:00
Thomas Gleixner
62e0468650 genirq: Add optional hardware synchronization for shutdown
free_irq() ensures that no hardware interrupt handler is executing on a
different CPU before actually releasing resources and deactivating the
interrupt completely in a domain hierarchy.

But that does not catch the case where the interrupt is on flight at the
hardware level but not yet serviced by the target CPU. That creates an
interesing race condition:

   CPU 0                  CPU 1               IRQ CHIP

                                              interrupt is raised
                                              sent to CPU1
			  Unable to handle
			  immediately
			  (interrupts off,
			   deep idle delay)
   mask()
   ...
   free()
     shutdown()
     synchronize_irq()
     release_resources()
                          do_IRQ()
                            -> resources are not available

That might be harmless and just trigger a spurious interrupt warning, but
some interrupt chips might get into a wedged state.

Utilize the existing irq_get_irqchip_state() callback for the
synchronization in free_irq().

synchronize_hardirq() is not using this mechanism as it might actually
deadlock unter certain conditions, e.g. when called with interrupts
disabled and the target CPU is the one on which the synchronization is
invoked. synchronize_irq() uses it because that function cannot be called
from non preemtible contexts as it might sleep.

No functional change intended and according to Marc the existing GIC
implementations where the driver supports the callback should be able
to cope with that core change. Famous last words.

Fixes: 464d12309e ("x86/vector: Switch IOAPIC to global reservation mode")
Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.279463375@linutronix.de
2019-07-03 10:12:29 +02:00
Thomas Gleixner
1d21f2af85 genirq: Fix misleading synchronize_irq() documentation
The function might sleep, so it cannot be called from interrupt
context. Not even with care.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.189241552@linutronix.de
2019-07-03 10:12:29 +02:00
Thomas Gleixner
4001d8e876 genirq: Delay deactivation in free_irq()
When interrupts are shutdown, they are immediately deactivated in the
irqdomain hierarchy. While this looks obviously correct there is a subtle
issue:

There might be an interrupt in flight when free_irq() is invoking the
shutdown. This is properly handled at the irq descriptor / primary handler
level, but the deactivation might completely disable resources which are
required to acknowledge the interrupt.

Split the shutdown code and deactivate the interrupt after synchronization
in free_irq(). Fixup all other usage sites where this is not an issue to
invoke the combined shutdown_and_deactivate() function instead.

This still might be an issue if the interrupt in flight servicing is
delayed on a remote CPU beyond the invocation of synchronize_irq(), but
that cannot be handled at that level and needs to be handled in the
synchronize_irq() context.

Fixes: f8264e3496 ("irqdomain: Introduce new interfaces to support hierarchy irqdomains")
Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.098196390@linutronix.de
2019-07-03 10:12:28 +02:00
Julien Thierry
17ce302f31 arm64: Fix interrupt tracing in the presence of NMIs
In the presence of any form of instrumentation, nmi_enter() should be
done before calling any traceable code and any instrumentation code.

Currently, nmi_enter() is done in handle_domain_nmi(), which is much
too late as instrumentation code might get called before. Move the
nmi_enter/exit() calls to the arch IRQ vector handler.

On arm64, it is not possible to know if the IRQ vector handler was
called because of an NMI before acknowledging the interrupt. However, It
is possible to know whether normal interrupts could be taken in the
interrupted context (i.e. if taking an NMI in that context could
introduce a potential race condition).

When interrupting a context with IRQs disabled, call nmi_enter() as soon
as possible. In contexts with IRQs enabled, defer this to the interrupt
controller, which is in a better position to know if an interrupt taken
is an NMI.

Fixes: bc3c03ccb4 ("arm64: Enable the support of pseudo-NMIs")
Cc: <stable@vger.kernel.org> # 5.1.x-
Cc: Will Deacon <will.deacon@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-21 15:49:58 +01:00
Minwoo Im
0e51833042 genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()
The *affd argument is neither used in irq_build_affinity_masks() nor
__irq_build_affinity_masks(). Remove it.

Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Minwoo Im <minwoo.im@samsung.com>
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602112117.31839-1-minwoo.im.dev@gmail.com
2019-06-12 10:52:45 +02:00
Daniel Lezcano
699785f5d8 genirq/timings: Add selftest for next event computation
The circular buffers are now validated with selftests. The next interrupt
index algorithm which is the hardest part to validate needs extra coverage.

Add a selftest which uses the intervals stored in the arrays and insert all
the values except the last one. The next event computation must return the
same value as the last element which was not inserted.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-9-daniel.lezcano@linaro.org
2019-06-12 10:47:05 +02:00
Daniel Lezcano
f52da98d90 genirq/timings: Add selftest for irqs circular buffer
After testing the per cpu interrupt circular event, make sure the per
interrupt circular buffer usage is correct.

Add tests to validate the interrupt circular buffer.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-8-daniel.lezcano@linaro.org
2019-06-12 10:47:05 +02:00
Daniel Lezcano
6aed82de71 genirq/timings: Add selftest for circular array
Due to the complexity of the code and the difficulty to debug it, add some
selftests to the framework in order to spot issues or regression at boot
time when the runtime testing is enabled for this subsystem.

This tests the circular buffer at the limits and validates:
 - the encoding / decoding of the values
 - the macro to browse the irq timings circular buffer
 - the function to push data in the circular buffer

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-7-daniel.lezcano@linaro.org
2019-06-12 10:47:04 +02:00
Daniel Lezcano
23aa3b9a6b genirq/timings: Encapsulate storing function
For the next patches providing the selftest, it is required to insert
interval values directly in the buffer in order to check the correctness of
the code. Encapsulate the code doing that in a always inline function in
order to reuse it in the test code.

No functional changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-6-daniel.lezcano@linaro.org
2019-06-12 10:47:04 +02:00
Daniel Lezcano
df025e47e4 genirq/timings: Encapsulate timings push
For the next patches providing the selftest, it is required to artificially
insert timings value in the circular buffer in order to check the
correctness of the code. Encapsulate the common code between the future
test code and the current code with an always-inline tag.

No functional change.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-5-daniel.lezcano@linaro.org
2019-06-12 10:47:04 +02:00
Daniel Lezcano
3c2e79f4ce genirq/timings: Optimize the period detection speed
With a minimal period and if there is a period which is a multiple of it
but lesser than the max period then it will be detected before and the
minimal period will be never reached.

 1 2 1 2 1 2 1 2 1 2 1 2
 <-----> <-----> <----->
 <-> <-> <-> <-> <-> <->

In that case, the minimum period is 2 and the maximum period is 5. That
means all repeating pattern of 2 will be detected as repeating pattern of
4, it is pointless to go up to 2 when searching for the period as it will
always fail.

Remove one loop iteration by increasing the minimal period to 3.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-4-daniel.lezcano@linaro.org
2019-06-12 10:47:03 +02:00
Daniel Lezcano
2840eef051 genirq/timings: Fix timings buffer inspection
It appears the index beginning computation is not correct, the current
code does:

     i = (irqts->count & IRQ_TIMINGS_MASK) - 1

If irqts->count is equal to zero, we end up with an index equal to -1,
but that does not happen because the function checks against zero
before and returns in such case.

However, if irqts->count is a multiple of IRQ_TIMINGS_SIZE, the
resulting & bit op will be zero and leads also to a -1 index.

Re-introduce the iteration loop belonging to the previous variance
code which was correct.

Fixes: bbba0e7c5c "genirq/timings: Add array suffix computation code"
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-3-daniel.lezcano@linaro.org
2019-06-12 10:47:03 +02:00
Daniel Lezcano
619c1baa91 genirq/timings: Fix next event index function
The current code is luckily working with most of the interval samples
testing but actually it fails to correctly detect pattern repetition
breaking at the end of the buffer.

Narrowing down the bug has been a real pain because of the pointers,
so the routine is rewrittne by using indexes instead.

Fixes: bbba0e7c5c "genirq/timings: Add array suffix computation code"
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20190527205521.12091-2-daniel.lezcano@linaro.org
2019-06-12 10:47:03 +02:00
Geert Uytterhoeven
43b98d876f genirq/irqdomain: Remove WARN_ON() on out-of-memory condition
There is no need to print a backtrace when memory allocation fails, as
the memory allocation core already takes care of that.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190527115742.2693-1-geert+renesas@glider.be
2019-05-28 13:10:55 -07:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Linus Torvalds
d9351ea14d Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ chip updates from Ingo Molnar:
 "A late irqchips update:

   - New TI INTR/INTA set of drivers

   - Rewrite of the stm32mp1-exti driver as a platform driver

   - Update the IOMMU MSI mapping API to be RT friendly

   - A number of cleanups and other low impact fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()
  irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()
  irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()
  iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  arm64: arch_k3: Enable interrupt controller drivers
  irqchip/ti-sci-inta: Add msi domain support
  soc: ti: Add MSI domain bus support for Interrupt Aggregator
  irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver
  dt-bindings: irqchip: Introduce TISCI Interrupt Aggregator bindings
  irqchip/ti-sci-intr: Add support for Interrupt Router driver
  dt-bindings: irqchip: Introduce TISCI Interrupt router bindings
  gpio: thunderx: Use the default parent apis for {request,release}_resources
  genirq: Introduce irq_chip_{request,release}_resource_parent() apis
  firmware: ti_sci: Add helper apis to manage resources
  firmware: ti_sci: Add RM mapping table for am654
  firmware: ti_sci: Add support for IRQ management
  firmware: ti_sci: Add support for RM core ops
  ...
2019-05-19 10:58:45 -07:00
Linus Torvalds
cf482a49af Driver core/kobject patches for 5.2-rc1
Here is the "big" set of driver core patches for 5.2-rc1
 
 There are a number of ACPI patches in here as well, as Rafael said they
 should go through this tree due to the driver core changes they
 required.  They have all been acked by the ACPI developers.
 
 There are also a number of small subsystem-specific changes in here, due
 to some changes to the kobject core code.  Those too have all been acked
 by the various subsystem maintainers.
 
 As for content, it's pretty boring outside of the ACPI changes:
   - spdx cleanups
   - kobject documentation updates
   - default attribute groups for kobjects
   - other minor kobject/driver core fixes
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXNHDbw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynDAgCfbb4LBR6I50wFXb8JM/R6cAS7qrsAn1unshKV
 8XCYcif2RxjtdJWXbjdm
 =/rLh
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core/kobject updates from Greg KH:
 "Here is the "big" set of driver core patches for 5.2-rc1

  There are a number of ACPI patches in here as well, as Rafael said
  they should go through this tree due to the driver core changes they
  required. They have all been acked by the ACPI developers.

  There are also a number of small subsystem-specific changes in here,
  due to some changes to the kobject core code. Those too have all been
  acked by the various subsystem maintainers.

  As for content, it's pretty boring outside of the ACPI changes:
   - spdx cleanups
   - kobject documentation updates
   - default attribute groups for kobjects
   - other minor kobject/driver core fixes

  All have been in linux-next for a while with no reported issues"

* tag 'driver-core-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (47 commits)
  kobject: clean up the kobject add documentation a bit more
  kobject: Fix kernel-doc comment first line
  kobject: Remove docstring reference to kset
  firmware_loader: Fix a typo ("syfs" -> "sysfs")
  kobject: fix dereference before null check on kobj
  Revert "driver core: platform: Fix the usage of platform device name(pdev->name)"
  init/config: Do not select BUILD_BIN2C for IKCONFIG
  Provide in-kernel headers to make extending kernel easier
  kobject: Improve doc clarity kobject_init_and_add()
  kobject: Improve docs for kobject_add/del
  driver core: platform: Fix the usage of platform device name(pdev->name)
  livepatch: Replace klp_ktype_patch's default_attrs with groups
  cpufreq: schedutil: Replace default_attrs field with groups
  padata: Replace padata_attr_type default_attrs field with groups
  irqdesc: Replace irq_kobj_type's default_attrs field with groups
  net-sysfs: Replace ktype default_attrs field with groups
  block: Replace all ktype default_attrs with groups
  samples/kobject: Replace foo_ktype's default_attrs field with groups
  kobject: Add support for default attribute groups to kobj_type
  driver core: Postpone DMA tear-down until after devres release for probe failure
  ...
2019-05-07 13:01:40 -07:00
Linus Torvalds
0968621917 Printk changes for 5.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAlzP8nQACgkQUqAMR0iA
 lPK79A/+NkRouqA9ihAZhUbgW0DHzOAFvUJSBgX11HQAZbGjngakuoyYFvwUx0T0
 m80SUTCysxQrWl+xLdccPZ9ZrhP2KFQrEBEdeYHZ6ymcYcl83+3bOIBS7VwdZAbO
 EzB8u/58uU/sI6ABL4lF7ZF/+R+U4CXveEUoVUF04bxdPOxZkRX4PT8u3DzCc+RK
 r4yhwQUXGcKrHa2GrRL3GXKsDxcnRdFef/nzq4RFSZsi0bpskzEj34WrvctV6j+k
 FH/R3kEcZrtKIMPOCoDMMWq07yNqK/QKj0MJlGoAlwfK4INgcrSXLOx+pAmr6BNq
 uMKpkxCFhnkZVKgA/GbKEGzFf+ZGz9+2trSFka9LD2Ig6DIstwXqpAgiUK8JFQYj
 lq1mTaJZD3DfF2vnGHGeAfBFG3XETv+mIT/ow6BcZi3NyNSVIaqa5GAR+lMc6xkR
 waNkcMDkzLFuP1r0p7ZizXOksk9dFkMP3M6KqJomRtApwbSNmtt+O2jvyLPvB3+w
 wRyN9WT7IJZYo4v0rrD5Bl6BjV15ZeCPRSFZRYofX+vhcqJQsFX1M9DeoNqokh55
 Cri8f6MxGzBVjE1G70y2/cAFFvKEKJud0NUIMEuIbcy+xNrEAWPF8JhiwpKKnU10
 c0u674iqHJ2HeVsYWZF0zqzqQ6E1Idhg/PrXfuVuhAaL5jIOnYY=
 =WZfC
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

Pull printk updates from Petr Mladek:

 - Allow state reset of printk_once() calls.

 - Prevent crashes when dereferencing invalid pointers in vsprintf().
   Only the first byte is checked for simplicity.

 - Make vsprintf warnings consistent and inlined.

 - Treewide conversion of obsolete %pf, %pF to %ps, %pF printf
   modifiers.

 - Some clean up of vsprintf and test_printf code.

* tag 'printk-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
  lib/vsprintf: Make function pointer_string static
  vsprintf: Limit the length of inlined error messages
  vsprintf: Avoid confusion between invalid address and value
  vsprintf: Prevent crash when dereferencing invalid pointers
  vsprintf: Consolidate handling of unknown pointer specifiers
  vsprintf: Factor out %pO handler as kobject_string()
  vsprintf: Factor out %pV handler as va_format()
  vsprintf: Factor out %p[iI] handler as ip_addr_string()
  vsprintf: Do not check address of well-known strings
  vsprintf: Consistent %pK handling for kptr_restrict == 0
  vsprintf: Shuffle restricted_pointer()
  printk: Tie printk_once / printk_deferred_once into .data.once for reset
  treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively
  lib/test_printf: Switch to bitmap_zalloc()
2019-05-07 09:18:12 -07:00
Linus Torvalds
2f1835dffa Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Ingo Molnar:
 "The changes in this cycle were:

   - Remove the irq timings/variance statistics code that tried to
     predict when the next interrupt would occur, which didn't work out
     as hoped and is replaced by another mechanism.

   - This new mechanism is the 'array suffix computation' estimate,
     which is superior to the previous one as it can detect not just a
     single periodic pattern, but independent periodic patterns along a
     log-2 scale of bucketing and exponential moving average. The
     comments are longer than the code - and it works better at
     predicting various complex interrupt patterns from real-world
     devices than the previous estimate.

   - avoid IRQ-work self-IPIs on the local CPU

   - fix work-list corruption in irq_set_affinity_notifier()"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq_work: Do not raise an IPI when queueing work on the local CPU
  genirq/devres: Use struct_size() in devm_kzalloc()
  genirq/timings: Add array suffix computation code
  genirq/timings: Remove variance computation code
  genirq: Prevent use-after-free and work list corruption
2019-05-06 13:45:04 -07:00
Thomas Gleixner
fb4e059265 irqchip updates for 5.2
- The huge (and terrifying) TI INTR/INTA set of drivers
 - Rewrite of the stm32mp1-exti driver as a platform driver
 - Update the IOMMU MSI mapping API to be RT friendly
 - A number of cleanups and other low impact fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlzMUswVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDOUUQALpVvlxPA2jXmlYTZ9/1I4ehSv78
 SFHsLHybvbdt0+zZDQE6wIS/OWc8fi8LRHtUCkNOygWMk8Ae8D8IeC6VwbPxhbQa
 RNvIULOC7RqapcWXmF3pYz+tfJehLPSmfTG3yGczXyMHSF6ypHKxX8+RIr6ZCeqH
 U2XVa7ELhWxeTuA52oxEiGymPGBZTEcMGcgGzhRxmDE9GVRw5XWrnKsG0nNCklMK
 DSKPoUtmKny42LIfVAFgS2PD081IRRQZII/j/mbA/NG5KNNQutmkZLJgFywjeJxu
 AOhYHmBiqyEfQs5S02i0GXMUITBvWsi9dPgjnwd2aRsPQ8nzXx2mUH4FQz9fqmHt
 ZxGsE1rLYlB0BS2h1Ap4bpHDjsgx/MvPYdarkN966/T6DHM6UYfdx9wnkVtH3oLJ
 Lg4UT+MWbz2f70JJ1jy5NWMDLVEyL7ERoZLivhxouXqAuGK7x3BUFPXXwTbep7Me
 E+N45kYycryCUjrt12EQ7PX/1W7KXe9Z7UX63VUsGmCxaPDgZ5T/ofybD4vJEIQ0
 fyaOK9jxBuTeVokFecCMxacWfRtQ3trLljlEJY5/1NZMXEmLwFbo0sy7458zIoss
 aTV+bk+N+xrGERRpswjGhhdRxjM41EcoiuJs52L9EL1IB/50ye6ENOPtGIDfuWXT
 XEm5jXGM4TVTOkf/
 =vP2/
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier

- The huge (and terrifying) TI INTR/INTA set of drivers
- Rewrite of the stm32mp1-exti driver as a platform driver
- Update the IOMMU MSI mapping API to be RT friendly
- A number of cleanups and other low impact fixes
2019-05-06 12:04:12 +02:00
Julien Grall
aaebdf8d68 genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domain.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemtible
context while the allocation requires to be called from preemptible
context.

A follow-up change will split the current logic in two functions
requiring to keep an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configuration, the field is protected
under a new config CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03 15:19:20 +01:00
Lokesh Vutla
2bd1298ac1 genirq: Introduce irq_chip_{request,release}_resource_parent() apis
Introduce irq_chip_{request,release}_resource_parent() apis so
that these can be used in hierarchical irqchips.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01 10:41:38 +01:00
Julien Grall
08970ecf74 irq/irqdomain: Fix typo in the comment on top of __irq_domain_alloc_irqs()
The word 'number' has been misspelt in the comment on top of
_irq_domain_alloc_irqs().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29 15:45:00 +01:00
Kimberly Brown
52ba92f588 irqdesc: Replace irq_kobj_type's default_attrs field with groups
The kobj_type default_attrs field is being replaced by the
default_groups field. Replace irq_kobj_type's default_attrs field with
default_groups and use the ATTRIBUTE_GROUPS macro to create irq_groups.

This patch was tested by verifying that the sysfs files for the
attributes in the default groups were created.

Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 22:06:11 +02:00
Gustavo A. R. Silva
2d65c42b43 genirq/devres: Use struct_size() in devm_kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
    int stuff;
    struct boo entry[];
};

size = sizeof(struct foo) + count * sizeof(struct boo);
instance = devm_kzalloc(dev, size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper.

instance = devm_kzalloc(dev, struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190410170914.GA16161@embeddedor
2019-04-16 21:54:03 +02:00
Sakari Ailus
d75f773c86 treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively
%pF and %pf are functionally equivalent to %pS and %ps conversion
specifiers. The former are deprecated, therefore switch the current users
to use the preferred variant.

The changes have been produced by the following command:

	git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
	while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done

And verifying the result.

Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: xen-devel@lists.xenproject.org
Cc: linux-acpi@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Cc: linux-pci@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-mm@kvack.org
Cc: ceph-devel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: David Sterba <dsterba@suse.com> (for btrfs)
Acked-by: Mike Rapoport <rppt@linux.ibm.com> (for mm/memblock.c)
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (for drivers/pci)
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2019-04-09 14:19:06 +02:00
Daniel Lezcano
bbba0e7c5c genirq/timings: Add array suffix computation code
The previous approach based on the variance was discarding values from
the timings when they were considered as anomalies as stated by the
normal law statistical model.

However in the interrupt life, there can be multiple anomalies due to the
nature of the device generating the interrupts, and most of the time a
repeating pattern can be observed, that is particulary true for network,
console, MMC or SSD devices.

The variance approach missed the patterns and it was only able to deal with
the interrupt coming in regular intervals, thus reducing considerably the
scope of what is predictable.

In order to find out the repeating patterns, the interrupt intervals are
grouped in a ilog2 basis to create a suite of numbers with small
amplitude. Every group contains an exponential moving average of the values
belonging to the group. The array suffix, a data structure used for string
searching, data compression, etc ..., is built from the suite of numbers
and the suffixes are then searched in this suite.

The tests showed the algorithm is able to find all repeating patterns,
as well as regular interval in less than 1us on x86-i7.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: rjw@rjwysocki.net
Cc: ulf.hansson@linaro.org
Cc: linux-pm@vger.kernel.org
Link: https://lkml.kernel.org/r/20190328151336.5316-2-daniel.lezcano@linaro.org
2019-04-05 22:51:29 +02:00
Daniel Lezcano
bfe8384498 genirq/timings: Remove variance computation code
The variance computation did not provide the expected results and will be
replaced with a different approach to compute the next interrupt based on
the array suffixes derived algorithm.

There is no good way to transform the variance code to the new algorithm,
so for ease of review remove the existing code first.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: rjw@rjwysocki.net
Cc: ulf.hansson@linaro.org
Cc: linux-pm@vger.kernel.org
Link: https://lkml.kernel.org/r/20190328151336.5316-1-daniel.lezcano@linaro.org
2019-04-05 22:51:29 +02:00
Stephen Boyd
325aa19598 genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip
has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned.

This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when
the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to
walk the chain of parents and set irq wake on any chips that don't have the
flag set either. If the intent is to call the .irq_set_wake() callback of
the parent irqchip, then we expect irqchip implementations to omit the
IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that
calls irq_chip_set_wake_parent().

The problem has been observed on a Qualcomm sdm845 device where set wake
fails on any GPIO interrupts after applying work in progress wakeup irq
patches to the GPIO driver. The chain of chips looks like this:

     QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP)

The GPIO controllers parent is the QCOM PDC irqchip which in turn has ARM
GIC as parent.  The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE flag
set, and so does the grandparent ARM GIC.

The GPIO driver doesn't know if the parent needs to set wake or not, so it
unconditionally calls irq_chip_set_wake_parent() causing this function to
return a failure because the parent irqchip (PDC) doesn't have the
.irq_set_wake() callback set. Returning 0 instead makes everything work and
irqs from the GPIO controller can be configured for wakeup.

Make it consistent by returning 0 (success) from irq_chip_set_wake_parent()
when a parent chip has IRQCHIP_SKIP_SET_WAKE set.

[ tglx: Massaged changelog ]

Fixes: 08b55e2a92 ("genirq: Add irqchip_set_wake_parent")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-gpio@vger.kernel.org
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org
2019-04-05 17:41:41 +02:00
Kefeng Wang
e8458e7afa genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
When CONFIG_SPARSE_IRQ is disable, the request_mutex in struct irq_desc
is not initialized which causes malfunction.

Fixes: 9114014cf4 ("genirq: Add mutex to irq desc to serialize request/free_irq()")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com
2019-04-05 14:37:56 +02:00
Prasad Sodagudi
59c39840f5 genirq: Prevent use-after-free and work list corruption
When irq_set_affinity_notifier() replaces the notifier, then the
reference count on the old notifier is dropped which causes it to be
freed. But nothing ensures that the old notifier is not longer queued
in the work list. If it is queued this results in a use after free and
possibly in work list corruption.

Ensure that the work is canceled before the reference is dropped.

Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: marc.zyngier@arm.com
Link: https://lkml.kernel.org/r/1553439424-6529-1-git-send-email-psodagud@codeaurora.org
2019-03-24 22:13:17 +01:00
Gustavo A. R. Silva
93417a3fda genirq: Mark expected switch case fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

With -Wimplicit-fallthrough added to CFLAGS:

 kernel/irq/manage.c: In function ‘irq_do_set_affinity’:
 kernel/irq/manage.c:198:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   cpumask_copy(desc->irq_common_data.affinity, mask);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 kernel/irq/manage.c:199:2: note: here
   case IRQ_SET_MASK_OK_NOCOPY:
   ^~~~

Annotate it.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20190228213714.GA9246@embeddedor
2019-03-23 12:32:01 +01:00
Valdis Kletnieks
bb2e320565 genirq/devres: Remove excess parameter from kernel doc
Building with 'make W=1' complains:

  CC      kernel/irq/devres.o
kernel/irq/devres.c:104: warning: Excess function parameter 'thread_fn'
			 description in 'devm_request_any_context_irq'

Remove it.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/31207.1552378676@turing-police
2019-03-22 13:34:12 +01:00
Linus Torvalds
3601fe43e8 This is the bulk of GPIO changes for the v5.1 cycle:
Core changes:
 
 - The big change this time around is the irqchip handling in
   the qualcomm pin controllers, closely coupled with the
   gpiochip. This rework, in a classic fall-between-the-chairs
   fashion has been sidestepped for too long. The Qualcomm
   IRQchips using the SPMI and SSBI transport mechanisms have
   been rewritten to use hierarchical irqchip. This creates
   the base from which I intend to gradually pull support for
   hierarchical irqchips into the gpiolib irqchip helpers to
   cut down on duplicate code. We have too many hacks in the
   kernel because people have been working around the missing
   hierarchical irqchip for years, and once it was there,
   noone understood it for a while. We are now slowly adapting
   to using it. This is why this pull requests include changes
   to MFD, SPMI, IRQchip core and some ARM Device Trees
   pertaining to the Qualcomm chip family. Since Qualcomm have
   so many chips and such large deployments it is paramount
   that this platform gets this right, and now it (hopefully)
   does.
 
 - Core support for pull-up and pull-down configuration, also
   from the device tree. When a simple GPIO chip support a
   "off or on" pull-up or pull-down resistor, we provide a
   way to set this up using machine descriptors or device tree.
   If more elaborate control of pull up/down (such as
   resistance shunt setting) is required, drivers should be
   phased over to use pin control. We do not yet provide a
   userspace ABI for this pull up-down setting but I suspect
   the makers are going to ask for it soon enough. PCA953x
   is the first user of this new API.
 
 - The GPIO mockup driver has been revamped after some
   discussion improving the IRQ simulator in the process.
   The idea is to make it possible to use the mockup for
   both testing and virtual prototyping, e.g. when you do
   not yet have a GPIO expander to play with but really
   want to get something to develop code around before
   hardware is available. It's neat. The blackbox testing
   usecase is currently making its way into kernelci.
 
 - ACPI GPIO core preserves non direction flags when updating
   flags.
 
 - A new device core helper for devm_platform_ioremap_resource()
   is funneled through the GPIO tree with Greg's ACK.
 
 New drivers:
 
 - TQ-Systems QTMX86 GPIO controllers (using port-mapped
   I/O)
 
 - Gateworks PLD GPIO driver (vaccumed up from OpenWrt)
 
 - AMD G-Series PCH (Platform Controller Hub) GPIO driver.
 
 - Fintek F81804 & F81966 subvariants.
 
 - PCA953x now supports NXP PCAL6416.
 
 Driver improvements:
 
 - IRQ support on the Nintendo Wii (Hollywood) GPIO.
 
 - get_direction() support for the MVEBU driver.
 
 - Set the right output level on SAMA5D2.
 
 - Drop the unused irq trigger setting on the Spreadtrum
   driver.
 
 - Wakeup support for PCA953x.
 
 - A slew of cleanups in the various Intel drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcgoLEAAoJEEEQszewGV1zjBAP/3OmTFGv49PFmJwSx+PlLiYf
 V6/UPaQzq81CGSMtHxbS51TyP9Id7PCfsacbuFYutzn0D1efvl7jrkb8qJ6fVvCM
 bl/i6q8ipRTPzAf1hD3QCgCe3BXCA064/OcPrz987oIvI3bJQXsmBjBSXHWr4Cwa
 WfB5DX/afn9TK3XHhMQGfw5f0d+TtnKAs90RTTVKiz9Ow8eFYZJOhgPkvhCR3Gi9
 YJIzIAiwhHZ7/zauo4JAYFU/O/Z3YEC5zeLne2ItebzNooRkSxdz0c9Hs7HlCZmU
 930Uv9jNN89N3vPqpZzAHtPvwDOmAILMWvKy9xRSp+eoIukarRJgF7ALPk7QWxK1
 yy+tGj4dXBQ6tI8W3wUN1WgjNpii3K1HbJ+1LQVQL2/q9o+3YXXqmjdjuw7C8YYV
 5ystNrUppkgfIIciHL4lhqw3wKJJhVEAns2V245hIitoShT+RvIg8GQbGZmWlQFd
 YsHbynqHL9iwfRNv26kEqZXZOo/4D1t6Scw+OPVyba2Wyttf+qbmg+XaYMqFaxYW
 mfydvdtymeCOUIPJMzw58KGPUTXJ4UPLENyayXNUHokr1a8VO8OIthY7zwi0CpvJ
 IcsAY9zoGxvfbRV922mlIsw3oOBcM2IN2lC9sY469ZVnjBrdC3rsQpIBZr+Vzz8i
 YlUfXLSGSyuUZUz//2eG
 =VoVC
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO updates from Linus Walleij:
 "This is the bulk of GPIO changes for the v5.1 cycle:

  Core changes:

   - The big change this time around is the irqchip handling in the
     qualcomm pin controllers, closely coupled with the gpiochip. This
     rework, in a classic fall-between-the-chairs fashion has been
     sidestepped for too long.

     The Qualcomm IRQchips using the SPMI and SSBI transport mechanisms
     have been rewritten to use hierarchical irqchip. This creates the
     base from which I intend to gradually pull support for hierarchical
     irqchips into the gpiolib irqchip helpers to cut down on duplicate
     code.

     We have too many hacks in the kernel because people have been
     working around the missing hierarchical irqchip for years, and once
     it was there, noone understood it for a while. We are now slowly
     adapting to using it.

     This is why this pull requests include changes to MFD, SPMI,
     IRQchip core and some ARM Device Trees pertaining to the Qualcomm
     chip family. Since Qualcomm have so many chips and such large
     deployments it is paramount that this platform gets this right, and
     now it (hopefully) does.

   - Core support for pull-up and pull-down configuration, also from the
     device tree. When a simple GPIO chip supports an "off or on" pull-up
     or pull-down resistor, we provide a way to set this up using
     machine descriptors or device tree.

     If more elaborate control of pull up/down (such as resistance shunt
     setting) is required, drivers should be phased over to use pin
     control. We do not yet provide a userspace ABI for this pull
     up-down setting but I suspect the makers are going to ask for it
     soon enough. PCA953x is the first user of this new API.

   - The GPIO mockup driver has been revamped after some discussion
     improving the IRQ simulator in the process.

     The idea is to make it possible to use the mockup for both testing
     and virtual prototyping, e.g. when you do not yet have a GPIO
     expander to play with but really want to get something to develop
     code around before hardware is available. It's neat. The blackbox
     testing usecase is currently making its way into kernelci.

   - ACPI GPIO core preserves non direction flags when updating flags.

   - A new device core helper for devm_platform_ioremap_resource() is
     funneled through the GPIO tree with Greg's ACK.

  New drivers:

   - TQ-Systems QTMX86 GPIO controllers (using port-mapped I/O)

   - Gateworks PLD GPIO driver (vaccumed up from OpenWrt)

   - AMD G-Series PCH (Platform Controller Hub) GPIO driver.

   - Fintek F81804 & F81966 subvariants.

   - PCA953x now supports NXP PCAL6416.

  Driver improvements:

   - IRQ support on the Nintendo Wii (Hollywood) GPIO.

   - get_direction() support for the MVEBU driver.

   - Set the right output level on SAMA5D2.

   - Drop the unused irq trigger setting on the Spreadtrum driver.

   - Wakeup support for PCA953x.

   - A slew of cleanups in the various Intel drivers"

* tag 'gpio-v5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (110 commits)
  gpio: gpio-omap: fix level interrupt idling
  gpio: amd-fch: Set proper output level for direction_output
  x86: apuv2: remove unused variable
  gpio: pca953x: Use PCA_LATCH_INT
  platform/x86: fix PCENGINES_APU2 Kconfig warning
  gpio: pca953x: Fix dereference of irq data in shutdown
  gpio: amd-fch: Fix type error found by sparse
  gpio: amd-fch: Drop const from resource
  gpio: mxc: add check to return defer probe if clock tree NOT ready
  gpio: ftgpio: Register per-instance irqchip
  gpio: ixp4xx: Add DT bindings
  x86: pcengines apuv2 gpio/leds/keys platform driver
  gpio: AMD G-Series PCH gpio driver
  drivers: depend on HAS_IOMEM for devm_platform_ioremap_resource()
  gpio: tqmx86: Set proper output level for direction_output
  gpio: sprd: Change to use SoC compatible string
  gpio: sprd: Use SoC compatible string instead of wildcard string
  gpio: of: Handle both enable-gpio{,s}
  gpio: of: Restrict enable-gpio quirk to regulator-gpio
  gpio: davinci: use devm_platform_ioremap_resource()
  ...
2019-03-08 10:09:53 -08:00
Thomas Gleixner
a324ca9cad irqchip updates for Linux 5.1
- Core pseudo-NMI handling code
 - Allow the default irq domain to be retrieved
 - A new interrupt controller for the Loongson LS1X platform
 - Affinity support for the SiFive PLIC
 - Better support for the iMX irqsteer driver
 - NUMA aware memory allocations for GICv3
 - A handful of other fixes (i8259, GICv3, PLIC)
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlxwGtgVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpD+2YP/2m9cVU3Z9ak8+HdSblq2Sw8QPfd
 RshYS+DzppLUzhzj2w2jnz9eP2fWEqBwrQmvtOI8Fo+id0PvdE3ngaP4hPMJDyuU
 Ou02TV6YwE4jknoO02RXOdeBJArccc1WR5++YZjp1gGUABFUPCHwKLoZgysurapV
 sZQ1Ten3wlsrZKKNTdWfYFWB36d7J3eqFYeGy3sll1wQ6XUbHmUJPPrSfXMqDYzY
 giDD/DH8IIhfnRs+T2TxGzKtTDMnJRYJYQK2bNgtNAW+wEY2BtCLSHj8//3bK0R9
 Jek9xg1NLpbQE+T8f2ZUd6BjbVxmDd3mGPvshXKyHFESl4fvC9yrddC86dBzHwrN
 VJmaES974PBuMtE2xPZGInh77EcelVC7OPeXsnjVMrUZo0s7tFY/TWA+rqCOLmgC
 A+0jagCDx1nTTYGXsqoyrHThoQoYZRX6AnXFeDJb9OLo3cV7x4w/FPORstM0PbAc
 butyZulVg1YQ+Y+oJK/UvIkdFL7FFqB/kgZK/lrL0InvbQMj4CBt3bsWY5OxgInF
 E02tgzEnrx1nHGi1XPnCTOs7DnKeaPR/h/u3PjoT7FeiZLClyiGDw7V/NuF+buLB
 w7Pqpn835CnkXC27MycTjPo23eZv690M4vcHL4vrhN+iuGp+2hZdXUiR15mZnH6m
 g0N8anZbL1iol0Gm
 =M6YA
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier

- Core pseudo-NMI handling code
- Allow the default irq domain to be retrieved
- A new interrupt controller for the Loongson LS1X platform
- Affinity support for the SiFive PLIC
- Better support for the iMX irqsteer driver
- NUMA aware memory allocations for GICv3
- A handful of other fixes (i8259, GICv3, PLIC)
2019-02-23 10:53:31 +01:00
Linus Walleij
3dda927fdb Merge branch 'ib-qcom-ssbi' into devel 2019-02-21 12:58:31 +01:00
Marc Zyngier
9f199dd34c irqdomain: Allow the default irq domain to be retrieved
The default irq domain allows legacy code to create irqdomain
mappings without having to track the domain it is allocating
from. Setting the default domain is a one shot, fire and forget
operation, and no effort was made to be able to retrieve this
information at a later point in time.

Newer irqdomain APIs (the hierarchical stuff) relies on both
the irqchip code to track the irqdomain it is allocating from,
as well as some form of firmware abstraction to easily identify
which piece of HW maps to which irq domain (DT, ACPI).

For systems without such firmware (or legacy platform that are
getting dragged into the 21st century), things are a bit harder.
For these cases (and these cases only!), let's provide a way
to retrieve the default domain, allowing the use of the v2 API
without having to resort to platform-specific hacks.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-21 10:32:28 +00:00
Linus Walleij
2f7db3c70f gpio: updates for v5.1 - part 2
- gpio-mockup updates improving the user-space testing interface and
   adding line state tracking for correct edge interrupts
 - interrupt simulator patch exposing the irq type configuration to
   users
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlxsN0kACgkQEacuoBRx
 13KDpQ/9GJvxpPn81vm8z8+IK+qQcbZ55lKIB90FB6kz0+ru6g+9gXWSA3FAifOr
 8ZxqwALIx52rLDYXpN0gQbOz4bIIXRm6eSKwm6QHyWlsW+59wvS4kAhu4a1j5jmy
 /jJlTQc+zOIkoYJX1EdRn581ehsmwftWm1kBIMudybC99vq3ks3a7nJjcNL/OYYo
 quvsVabB2n2/At4g4SfP4BRA1Hfgb1+X8rUcqHiIKlYHy6bVggrLLOcyEAwECoIT
 uvmXNGGD5g8W//sTi/Ex8xmR9xSdF3tI3PQQODvrRU4nbp7gYOIP7qFzUrBeZ+dP
 tRWmy0FVj6DaHbm9SBhVa/i8na7K04ibUAr/oknrPkBc55eSf3A9SWuqDa7HoC+L
 voFlndfhx8l8yEMQAui+S/NapRSOLm/UeOSBdkpe/NQSEOoi6QYDT+fqc0xCYd9F
 tvAjTLwDVpP17AxUHiDQFog/XESiNyhfGTB0Ca4utYfk6dhSyS7M2O4pO7p+RPt+
 /IMoL5KTkuJ3HMyC22M/yCyQrIYrNBc0dKG1JVKCCLQSLAfdtZ7zjQSiwp6uPlL8
 hW9D4k+FKX93jQm4Z0c/JSIC/O04pElILTdf0Oy5ki4QpgJvJpYqMupVaJ/rUyTF
 FsYcmRS3dXfGLbfTJUSnx8P293qiRktO8968RHX4PWRi2sF/L5w=
 =mEAi
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.1-updates-for-linus-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel

gpio: updates for v5.1 - part 2

- gpio-mockup updates improving the user-space testing interface and
  adding line state tracking for correct edge interrupts
- interrupt simulator patch exposing the irq type configuration to
  users
2019-02-20 09:43:45 +01:00
Bartosz Golaszewski
8d91ecc84d irq/irq_sim: add irq_set_type() callback
Implement the irq_set_type() callback and call irqd_set_trigger_type()
internally so that users interested in the configured trigger type can
later retrieve it using irqd_get_trigger_type(). We only support edge
trigger types.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-19 17:42:28 +01:00
Thomas Gleixner
a6a309edba genirq/affinity: Remove the leftovers of the original set support
Now that the NVME driver is converted over to the calc_set() callback, the
workarounds of the original set support can be removed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.689834224@linutronix.de
2019-02-18 11:21:29 +01:00
Ming Lei
c66d4bd110 genirq/affinity: Add new callback for (re)calculating interrupt sets
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
  	for nr_sets after the invocation of the callback so it catches
  	broken drivers. Fixed the kernel doc comments for struct
  	irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de
2019-02-18 11:21:28 +01:00
Ming Lei
9cfef55bb5 genirq/affinity: Store interrupt sets size in struct irq_affinity
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one
or more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via
a pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a
loop in the driver to determine the maximum number of interrupts which
are provided by the PCI capabilities and the underlying CPU resources.
This loop would have to be replicated in every driver which wants to
utilize this mechanism. That's unwanted code duplication and error
prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and
their size, in the core code. As the core code does not have any
knowledge about the underlying device, a driver specific callback will
be added to struct affinity_desc, which will be invoked by the core
code. The callback will get the number of available interupts as an
argument, so the driver can calculate the corresponding number and size
of interrupt sets.

To support this, two modifications for the handling of struct irq_affinity
are required:

1) The (optional) interrupt sets size information is contained in a
   separate array of integers and struct irq_affinity contains a
   pointer to it.

   This is cumbersome and as the maximum number of interrupt sets is small,
   there is no reason to have separate storage. Moving the size array into
   struct affinity_desc avoids indirections and makes the code simpler.

2) At the moment the struct irq_affinity pointer which is handed in from
   the driver and passed through to several core functions is marked
   'const'.

   With the upcoming callback to recalculate the number and size of
   interrupt sets, it's necessary to remove the 'const'
   qualifier. Otherwise the callback would not be able to update the data.

Implement #1 and store the interrupt sets size in 'struct irq_affinity'.

No functional change.

[ tglx: Fixed the memcpy() size so it won't copy beyond the size of the
  	source. Fixed the kernel doc comments for struct irq_affinity and
  	de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.423723127@linutronix.de
2019-02-18 11:21:27 +01:00
Thomas Gleixner
0145c30e89 genirq/affinity: Code consolidation
All information and calculations in the interrupt affinity spreading code
is strictly unsigned int. Though the code uses int all over the place.

Convert it over to unsigned int.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.336424556@linutronix.de
2019-02-18 11:21:27 +01:00
Linus Walleij
8fab3d713c gpio updates for v5.1
- support for a new variant of pca953x
 - documentation fix from Wolfram
 - some tegra186 name changes
 - two minor fixes for madera and altera-a10sr
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlxleLcACgkQEacuoBRx
 13I45Q//YMGUYzkMjOL+lp2DYnnVhVNqrF4hoLjinWVrnhZ6gqu88RgV2Cea4Pta
 oxVxnSsE8LK7kY8VZ8tcBmIqLLkQAJdSVtqkeSoZF2vhWBAbE9ZaSOYb17SIkSXK
 Ok16lZgZ+ZWOM5EjEvuRpB/qYGjX2glD5/Y2Kl7+wsX1W6U2pXasP0IjhcvDU8mJ
 NXNgfkr6kluMUqHJyqKo8eT/P3Hdv0CK9GsN2vGyfJenCdTSd7EC6KuhWAivi+fG
 /lf1bVuc2cCiXjxdSOXx+Yz7SjNe56viTaqnn/K6OlfLgErjKnRW+AxPkTZXNtDi
 pfMMpPXiwPcbQR2wrXG/7OMmJ1kUsfWoIUCx5RDwhF1KbEQVqgaSITLylk+4Yp/3
 eM0fYsQ+KvOdAnWKSgfxBhaaiO7z5XDdrnkSHBDoiBrm07BqBgK/v3Rivzf2GMEv
 QvM4OBfThS9I8skV5BaOBRDfHZs4N0EU/vhsW9gt50urtlSM0vSYx6kdMq/8R0k4
 NkJT43u+1vi5koMljBAsZYZiyXOQ2B+PlfpTMfMu+93QH8wlu9mOt1r3YTQyA1Xf
 jiOK8M2yQKP5g7RuPM6MtMsqlZKDM5nAlSf7S280Z3+vBd+LaELbXvT2/JL5ViGU
 hfH/gaNwUGUYd8EsWvfhHVdPAAecDCwxfKyKEnFGhMrtunTgwfI=
 =nV64
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.1-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel

gpio updates for v5.1

- support for a new variant of pca953x
- documentation fix from Wolfram
- some tegra186 name changes
- two minor fixes for madera and altera-a10sr
2019-02-17 21:59:33 +01:00
Thomas Gleixner
d869f86645 Merge branch 'linus' into irq/core
Pick up upstream changes to avoid conflicts for pending patches.
2019-02-14 22:26:50 +01:00
Julien Thierry
a51866946c genirq: Fix wrong name in request_percpu_nmi() description
ready_percpu_nmi() was the previous name of prepare_percpu_nmi(). Update
request_percpu_nmi() comment with the correct function name.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reported-by: Li Wei <liwei391@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-14 10:13:10 +00:00
Linus Walleij
5aa5bd563c genirq: introduce irq_chip_mask_ack_parent()
The hierarchical irqchip never before ran into a situation
where the parent is not "simple", i.e. does not implement
.irq_ack() and .irq_mask() like most, but the qcom-pm8xxx.c
happens to implement only .irq_mask_ack().

Since we want to make ssbi-gpio a hierarchical child of this
irqchip, it must *also* only implement .irq_mask_ack()
and call down to the parent, and for this we of course
need irq_chip_mask_ack_parent().

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-02-13 09:23:05 +01:00
Brian Masney
b5c231d8c8 genirq: introduce irq_domain_translate_twocell
Add a new function irq_domain_translate_twocell() that is to be used as
the translate function in struct irq_domain_ops for the v2 IRQ API.

This patch also changes irq_domain_xlate_twocell() from the v1 IRQ API
to call irq_domain_translate_twocell() in the v2 IRQ API. This required
changes to of_phandle_args_to_fwspec()'s arguments so that it can be
called from multiple places.

Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-02-13 09:22:05 +01:00
Thomas Gleixner
1136b07289 genirq: Avoid summation loops for /proc/stat
Waiman reported that on large systems with a large amount of interrupts the
readout of /proc/stat takes a long time to sum up the interrupt
statistics. In principle this is not a problem. but for unknown reasons
some enterprise quality software reads /proc/stat with a high frequency.

The reason for this is that interrupt statistics are accounted per cpu. So
the /proc/stat logic has to sum up the interrupt stats for each interrupt.

This can be largely avoided for interrupts which are not marked as
'PER_CPU' interrupts by simply adding a per interrupt summation counter
which is incremented along with the per interrupt per cpu counter.

The PER_CPU interrupts need to avoid that and use only per cpu accounting
because they share the interrupt number and the interrupt descriptor and
concurrent updates would conflict or require unwanted synchronization.

Reported-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Daniel Colascione <dancol@google.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/20190208135020.925487496@linutronix.de


8<-------------

v2: Undo the unintentional layout change of struct irq_desc.

 include/linux/irqdesc.h |    1 +
 kernel/irq/chip.c       |   12 ++++++++++--
 kernel/irq/internals.h  |    8 +++++++-
 kernel/irq/irqdesc.c    |    7 ++++++-
 4 files changed, 24 insertions(+), 4 deletions(-)
2019-02-10 21:34:45 +01:00
Ming Lei
347253c42d genirq/affinity: Move allocation of 'node_to_cpumask' to irq_build_affinity_masks()
'node_to_cpumask' is just one temparay variable for irq_build_affinity_masks(),
so move it into irq_build_affinity_masks().

No functioanl change.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Link: https://lkml.kernel.org/r/20190125095347.17950-2-ming.lei@redhat.com
2019-02-10 19:53:55 +01:00
Julien Thierry
6e4933a006 irqdesc: Add domain handler for NMIs
NMI handling code should be executed between calls to nmi_enter and
nmi_exit.

Add a separate domain handler to properly setup NMI context when handling
an interrupt requested as NMI.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-05 14:37:05 +00:00
Julien Thierry
2dcf1fbcad genirq: Provide NMI handlers
Provide flow handlers that are NMI safe for interrupts and percpu_devid
interrupts.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-05 14:37:01 +00:00
Julien Thierry
4b078c3f1a genirq: Provide NMI management for percpu_devid interrupts
Add support for percpu_devid interrupts treated as NMIs.

Percpu_devid NMIs need to be setup/torn down on each CPU they target.

The same restrictions as for global NMIs still apply for percpu_devid NMIs.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-05 14:36:58 +00:00
Julien Thierry
b525903c25 genirq: Provide basic NMI management for interrupt lines
Add functionality to allocate interrupt lines that will deliver IRQs
as Non-Maskable Interrupts. These allocations are only successful if
the irqchip provides the necessary support and allows NMI delivery for the
interrupt line.

Interrupt lines allocated for NMI delivery must be enabled/disabled through
enable_nmi/disable_nmi_nosync to keep their state consistent.

To treat a PERCPU IRQ as NMI, the interrupt must not be shared nor threaded,
the irqchip directly managing the IRQ must be the root irqchip and the
irqchip cannot be behind a slow bus.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-02-05 14:36:57 +00:00
Brian Masney
38f7ae9bdf genirq: export irq_chip_set_wake_parent symbol
Export the irq_chip_set_wake_parent symbol so that drivers with
hierarchical IRQ chips can be built as a module.

Signed-off-by: Brian Masney <masneyb@onstation.org>
Reported-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-02-04 11:02:51 +01:00
Greg Kroah-Hartman
434537bbd5 genirq/debugfs: No need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the return
value.  The function can work or not, but the code logic should never do
something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190122152151.16139-50-gregkh@linuxfoundation.org
2019-01-29 20:04:21 +01:00
Huacai Chen
12fee4cd5b genirq/irqdesc: Fix double increment in alloc_descs()
The recent rework of alloc_descs() introduced a double increment of the
loop counter. As a consequence only every second affinity mask is
validated.

Remove it.

[ tglx: Massaged changelog ]

Fixes: c410abbbac ("genirq/affinity: Add is_managed to struct irq_affinity_desc")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhuacai@gmail.com>
Cc: Dou Liyang <douliyangs@gmail.com>
Link: https://lkml.kernel.org/r/1547694009-16261-1-git-send-email-chenhc@lemote.com
2019-01-18 00:43:09 +01:00
Srinivas Ramana
bddda606ec genirq: Make sure the initial affinity is not empty
If all CPUs in the irq_default_affinity mask are offline when an interrupt
is initialized then irq_setup_affinity() can set an empty affinity mask for
a newly allocated interrupt.

Fix this by falling back to cpu_online_mask in case the resulting affinity
mask is zero.

Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.org
2019-01-15 11:23:27 +01:00
Mathieu Malaterre
01cdfa912f genirq: Correctly annotate implicit fall through
There is a plan to build the kernel with -Wimplicit-fallthrough. The
fallthrough in __handle_irq_event_percpu() has a fallthrough annotation
which is followed by an additional comment and is not recognized by GCC.

Separate the 'fall through' and the rest of the comment with a dash so the
regular expression used by GCC matches.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190114203633.18557-1-malat@debian.org
2019-01-15 10:40:53 +01:00
Mathieu Malaterre
44133f7eae genirq: Annotate implicit fall through
There is a plan to build the kernel with -Wimplicit-fallthrough. The
fallthrough in __irq_set_trigger() lacks an annotation. Add it.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20190114203154.17125-1-malat@debian.org
2019-01-15 10:40:34 +01:00
Dou Liyang
c410abbbac genirq/affinity: Add is_managed to struct irq_affinity_desc
Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled. That limitation was reported by Kashyap and Sumit.

Expand struct irq_affinity_desc with a new bit 'is_managed' which is set
for truly managed interrupts (queue interrupts) and cleared for the general
device interrupts.

[ tglx: Simplify code and massage changelog ]

Reported-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reported-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: bhelgaas@google.com
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-3-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Dou Liyang
bec04037e4 genirq/core: Introduce struct irq_affinity_desc
The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides based on the pointer being non
NULL whether an interrupt is managed or not.

Devices which use managed interrupts usually have two classes of
interrupts:

  - Interrupts for multiple device queues
  - Interrupts for general device management

Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.

Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:

 default_irq_affinity = 4..7

So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.

It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.

To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.

Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxilliary information in the next step.

No functional change, just preparatory work.

[ tglx: Simplified logic and clarified changelog ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: kashyap.desai@broadcom.com
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: sumit.saxena@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-2-douliyangs@gmail.com
2018-12-19 11:32:08 +01:00
Thomas Gleixner
c2899c3470 genirq/affinity: Remove excess indentation
Plus other coding style issues which stood out while staring at that code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-12-19 11:32:07 +01:00
Thomas Gleixner
ff3730a497 irqchip updates for 4.21
- A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
 - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
 - A number of SPDX cleanups
 - A workaround for a very broken GICv3 implementation
 - A platform-msi fix
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlwZI8cVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDyokP+gKoKbZMc1E7dX6WxUrKh2N+fMJF
 uVbuGF2s57CLG955YNuyo8BK4meWJIHGO3JahwE8I/9eu0G7PaudYvpZgP7s/sxD
 XHLWFVHB1mq4lExMcluT0jG4ZpX7EKvYB1KGqgYM1ScOS9Uubb4ZG9T5GPhUT/YM
 w1BAtHaZmCAg8d0wNPUMaAFc9Bd2B9Z1C8nwS+wpdJRxYxE9x8BES42r95rbXCG6
 5Cq2ol/NbF4RbFodel4YdiAIKfrQtXyQ3N3twC5GRXln4XLjUfzs4mA5rxLLoeGZ
 2UGXeIk0GcokSWF/e+0p3tQDWKwdbqoBhbRbqk7u5ZWuEWTRf4Zot3IlCVpJAMM3
 iRw5XChWxovC+/oqgin4sp1gNpSRgf5mMvR1EauR5DTVtwlOjUBKaPEyKLrPITOo
 B42EJugJ94J0YVdT9RUJsOSXIdOiYFE6I9F4i/XioLYq5FItBB56/81ARZgEncpg
 FEdtseCCtRC3WWGzghxZsSzCW3iGi8wdddRdZmOXCNdPtH03TZg0dGPS+KIn8Soh
 eVSGImV/4efN6hh6fSryeR02fYT3DKGgDQUiV4e/1SOSzxy6VjjrOh48tB8qn/M7
 NbFZMqDKnltsXT2C+bh6zjhorbVCkj8AEtx1oF0d7iIyBxor3eHUelTz6VglNlLq
 RFetH+Yjh9nt9ReO
 =1Mk9
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
 - Updates for new (and old) platforms (i.MX8MQ, F1C100s)
 - A number of SPDX cleanups
 - A workaround for a very broken GICv3 implementation
 - A platform-msi fix
 - Various cleanups
2018-12-18 18:37:27 +01:00
Ingo Molnar
c5f48c0a7a genirq: Fix various typos in comments
Go over the IRQ subsystem source code (including irqchip drivers) and
fix common typos in comments.

No change in functionality intended.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
2018-12-18 14:22:28 +01:00
Bartosz Golaszewski
06459901d5 irq/irq_sim: Store multiple interrupt offsets in a bitmap
Two threads can try to fire the irq_sim with different offsets and will
end up fighting for the irq_work asignment. Thomas Gleixner suggested a
solution based on a bitfield where we set a bit for every offset
associated with an interrupt that should be fired and then iterate over
all set bits in the interrupt handler.

This is a slightly modified solution using a bitmap so that we don't
impose a limit on the number of interrupts one can allocate with
irq_sim.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-13 09:35:31 +00:00
Long Li
e8da8794a7 genirq/matrix: Improve target CPU selection for managed interrupts.
On large systems with multiple devices of the same class (e.g. NVMe disks,
using managed interrupts), the kernel can affinitize these interrupts to a
small subset of CPUs instead of spreading them out evenly.

irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
of possible target CPUs which has the lowest number of interrupt vectors
allocated.

This is done by searching the CPU with the highest number of available
vectors. While this is correct for non-managed CPUs it can select the wrong
CPU for managed interrupts. Under certain constellations this results in
affinitizing the managed interrupts of several devices to a single CPU in
a set.

The book keeping of available vectors works the following way:

 1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

 2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affect the available count.

    The reservation persist up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

For non-managed interrupts the available count is the correct selection
criterion because the guaranteed reservations need to be taken into
account. Using the allocated counter could lead to a failing allocation in
the following situation (total vector space of 10 assumed):

		 CPU0	CPU1
 available:	    2	   0
 allocated:	    5	   3   <--- CPU1 is selected, but available space = 0
 managed reserved:  3	   7

 while available yields the correct result.

For managed interrupts the available count is not the appropriate
selection criterion because as explained above the available count is not
affected by the actual vector allocation.

The following example illustrates that. Total vector space of 10
assumed. The starting point is:

		 CPU0	CPU1
 available:	    5	   4
 allocated:	    2	   3
 managed reserved:  3	   3

 Allocating vectors for three non-managed interrupts will result in
 affinitizing the first two to CPU0 and the third one to CPU1 because the
 available count is adjusted with each allocation:

		  CPU0	CPU1
 available:	     5	   4	<- Select CPU0 for 1st allocation
 --> allocated:	     3	   3

 available:	     4	   4	<- Select CPU0 for 2nd allocation
 --> allocated:	     4	   3

 available:	     3	   4	<- Select CPU1 for 3rd allocation
 --> allocated:	     4	   4

 But the allocation of three managed interrupts starting from the same
 point will affinitize all of them to CPU0 because the available count is
 not affected by the allocation (see above). So the end result is:

		  CPU0	CPU1
 available:	     5	   4
 allocated:	     5	   3

Introduce a "managed_allocated" field in struct cpumap to track the vector
allocation for managed interrupts separately. Use this information to
select the target CPU when a vector is allocated for a managed interrupt,
which results in more evenly distributed vector assignments. The above
example results in the following allocations:

		 CPU0	CPU1
 managed_allocated: 0	   0	<- Select CPU0 for 1st allocation
 --> allocated:	    3	   3

 managed_allocated: 1	   0	<- Select CPU1 for 2nd allocation
 --> allocated:	    3	   4

 managed_allocated: 1	   1	<- Select CPU0 for 3rd allocation
 --> allocated:	    4	   4

The allocation of non-managed interrupts is not affected by this change and
is still evaluating the available count.

The overall distribution of interrupt vectors for both types of interrupts
might still not be perfectly even depending on the number of non-managed
and managed interrupts in a system, but due to the reservation guarantee
for managed interrupts this cannot be avoided.

Expose the new field in debugfs as well.

[ tglx: Clarified the background of the problem in the changelog and
  	described it independent of NVME ]

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Kelley <mikelley@microsoft.com>
Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com
2018-11-06 23:20:13 +01:00
Jens Axboe
6da4b3ab9a genirq/affinity: Add support for allocating interrupt sets
A driver may have a need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.

Add support for defining a number of sets in the irq_affinity structure, of
varying sizes, and get each set affinitized correctly across the machine.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20181102145951.31979-5-ming.lei@redhat.com
2018-11-05 12:16:27 +01:00
Ming Lei
060746d9e3 genirq/affinity: Pass first vector to __irq_build_affinity_masks()
No functional change.

Prepares for support of allocating and affinitizing sets of interrupts, in
which each set of interrupts needs a full two stage spreading. The first
vector argument is necessary for this so the affinitizing starts from the
first vector of each set.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Link: https://lkml.kernel.org/r/20181102145951.31979-4-ming.lei@redhat.com
2018-11-05 12:16:26 +01:00
Ming Lei
5c903e108d genirq/affinity: Move two stage affinity spreading into a helper function
No functional change. Prepares for supporting allocating and affinitizing
interrupt sets.

[ tglx: Minor changelog tweaks ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Link: https://lkml.kernel.org/r/20181102145951.31979-3-ming.lei@redhat.com
2018-11-05 12:16:26 +01:00
Long Li
b825921990 genirq/affinity: Spread IRQs to all available NUMA nodes
If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
which are allocated for a device, the interrupt affinity spreading code
fails to spread them across all nodes.

The reason is, that the spreading code starts from node 0 and continues up
to the number of interrupts requested for allocation. This leaves the nodes
past the last interrupt unused.

This results in interrupt concentration on the first nodes which violates
the assumption of the block layer that all nodes are covered evenly. As a
consequence the NUMA nodes above the number of interrupts are all assigned
to hardware queue 0 and therefore NUMA node 0, which results in bad
performance and has CPU hotplug implications, because queue 0 gets shut
down when the last CPU of node 0 is offlined.

Go over all NUMA nodes and assign them round-robin to all requested
interrupts to solve this.

[ tglx: Massaged changelog ]

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
2018-11-05 12:16:26 +01:00
Michael Kelley
57f01796f1 irq/matrix: Fix memory overallocation
IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
the size of a long, yielding a byte count. But it is used to size an array
of longs, which is way more memory than is needed.

Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
arrays come out the correct size.

Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: KY Srinivasan <kys@microsoft.com>
Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com
2018-11-01 10:00:38 +01:00
Linus Torvalds
5947a64a7e Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt brigade came up with the following updates:

   - Driver for the Marvell System Error Interrupt machinery

   - Overhaul of the GIC-V3 ITS driver

   - Small updates and fixes all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  genirq: Fix race on spurious interrupt detection
  softirq: Fix typo in __do_softirq() comments
  genirq: Fix grammar s/an /a /
  irqchip/gic: Unify GIC priority definitions
  irqchip/gic-v3: Remove acknowledge loop
  dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller
  dt-bindings/interrupt-controller: Update Marvell ICU bindings
  irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI)
  arm64: marvell: Enable SEI driver
  irqchip/irq-mvebu-sei: Add new driver for Marvell SEI
  irqchip/irq-mvebu-icu: Support ICU subnodes
  irqchip/irq-mvebu-icu: Disociate ICU and NSR
  irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts
  irqchip/irq-mvebu-icu: Fix wrong private data retrieval
  dt-bindings/interrupt-controller: Fix Marvell ICU length in the example
  genirq/msi: Allow creation of a tree-based irqdomain for platform-msi
  dt-bindings: irqchip: renesas-irqc: Document r8a7744 support
  dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support
  irqchip/pdc: Setup all edge interrupts as rising edge at GIC
  irqchip/gic-v3-its: Allow use of LPI tables in reserved memory
  ...
2018-10-25 11:43:47 -07:00
Lukas Wunner
746a923b86 genirq: Fix race on spurious interrupt detection
Commit 1e77d0a1ed ("genirq: Sanitize spurious interrupt detection of
threaded irqs") made detection of spurious interrupts work for threaded
handlers by:

a) incrementing a counter every time the thread returns IRQ_HANDLED, and
b) checking whether that counter has increased every time the thread is
   woken.

However for oneshot interrupts, the commit unmasks the interrupt before
incrementing the counter.  If another interrupt occurs right after
unmasking but before the counter is incremented, that interrupt is
incorrectly considered spurious:

time
 |  irq_thread()
 |    irq_thread_fn()
 |      action->thread_fn()
 |      irq_finalize_oneshot()
 |        unmask_threaded_irq()            /* interrupt is unmasked */
 |
 |                  /* interrupt fires, incorrectly deemed spurious */
 |
 |    atomic_inc(&desc->threads_handled); /* counter is incremented */
 v

This is observed with a hi3110 CAN controller receiving data at high volume
(from a separate machine sending with "cangen -g 0 -i -x"): The controller
signals a huge number of interrupts (hundreds of millions per day) and
every second there are about a dozen which are deemed spurious.

In theory with high CPU load and the presence of higher priority tasks, the
number of incorrectly detected spurious interrupts might increase beyond
the 99,900 threshold and cause disablement of the interrupt.

In practice it just increments the spurious interrupt count. But that can
cause people to waste time investigating it over and over.

Fix it by moving the accounting before the invocation of
irq_finalize_oneshot().

[ tglx: Folded change log update ]

Fixes: 1e77d0a1ed ("genirq: Sanitize spurious interrupt detection of threaded irqs")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Akshay Bhat <akshay.bhat@timesys.com>
Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
Cc: stable@vger.kernel.org # v3.16+
Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de
2018-10-19 17:31:00 +02:00
Geert Uytterhoeven
b8d62f33b7 genirq: Fix grammar s/an /a /
Fix a grammar mistake in <linux/interrupt.h>.

[ mingo: While at it also fix another similar error in another comment as well. ]

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Jiri Kosina <trivial@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181008111726.26286-1-geert%2Brenesas@glider.be
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09 07:50:41 +02:00
Marc Zyngier
94967b55eb genirq/debugfs: Reinstate full OF path for domain name
On a DT based system, we use the of_node full name to name the
corresponding irq domain. We expect that name to be unique, so so that
domains with the same base name won't clash (this happens on multi-node
topologies, for example).

Since a7e4cfb0a7 ("of/fdt: only store the device node basename in
full_name"), of_node_full_name() lies and only returns the basename. This
breaks the above requirement, and we end-up with only a subset of the
domains in /sys/kernel/debug/irq/domains.

Let's reinstate the feature by using the fancy new %pOF format specifier,
which happens to do the right thing.

Fixes: a7e4cfb0a7 ("of/fdt: only store the device node basename in full_name")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20181001100522.180054-3-marc.zyngier@arm.com
2018-10-01 12:24:53 +02:00
Marc Zyngier
513145ea66 genirq/debugfs: Reset domain debugfs_file on removal of the debugfs file
When removing a debugfs file for a given irq domain, we fail to clear the
corresponding field, meaning that the corresponding domain won't be created
again if we need to do so.

It turns out that this is exactly what irq_domain_update_bus_token does
(delete old file, update domain name, recreate file).

This doesn't have any impact other than making debug more difficult, but we
do value ease of debugging... So clear the debugfs_file field.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20181001100522.180054-2-marc.zyngier@arm.com
2018-10-01 12:24:53 +02:00
Dou Liyang
76f99ae5b5 irq/matrix: Spread managed interrupts on allocation
Linux spreads out the non managed interrupt across the possible target CPUs
to avoid vector space exhaustion.

Managed interrupts are treated differently, as for them the vectors are
reserved (with guarantee) when the interrupt descriptors are initialized.

When the interrupt is requested a real vector is assigned. The assignment
logic uses the first CPU in the affinity mask for assignment. If the
interrupt has more than one CPU in the affinity mask, which happens when a
multi queue device has less queues than CPUs, then doing the same search as
for non managed interrupts makes sense as it puts the interrupt on the
least interrupt plagued CPU. For single CPU affine vectors that's obviously
a NOOP.

Restructre the matrix allocation code so it does the 'best CPU' search, add
the sanity check for an empty affinity mask and adapt the call site in the
x86 vector management code.

[ tglx: Added the empty mask check to the core and improved change log ]

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com
2018-09-18 18:27:24 +02:00
Dou Liyang
8ffe4e61c0 irq/matrix: Split out the CPU selection code into a helper
Linux finds the CPU which has the lowest vector allocation count to spread
out the non managed interrupts across the possible target CPUs, but does
not do so for managed interrupts.

Split out the CPU selection code into a helper function for reuse. No
functional change.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180908175838.14450-1-dou_liyang@163.com
2018-09-18 18:27:24 +02:00
Linus Torvalds
d0daaeaf60 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull genirq updates from Thomas Gleixner:
 "The irq departement provides:

   - A synchronization fix for free_irq() to synchronize just the
     removed interrupt thread on shared interrupt lines.

   - Consolidate the multi low level interrupt entry handling and mvoe
     it to the generic code instead of adding yet another copy for
     RISC-V

   - Refactoring of the ARM LPI allocator and LPI exposure to the
     hypervisor

   - Yet another interrupt chip driver for the JZ4725B SoC

   - Speed up for /proc/interrupts as people seem to love reading this
     file with high frequency

   - Miscellaneous fixes and updates"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  irqchip/gic-v3-its: Make its_lock a raw_spin_lock_t
  genirq/irqchip: Remove MULTI_IRQ_HANDLER as it's now obselete
  openrisc: Use the new GENERIC_IRQ_MULTI_HANDLER
  arm64: Use the new GENERIC_IRQ_MULTI_HANDLER
  ARM: Convert to GENERIC_IRQ_MULTI_HANDLER
  irqchip: Port the ARM IRQ drivers to GENERIC_IRQ_MULTI_HANDLER
  irqchip/gic-v3-its: Reduce minimum LPI allocation to 1 for PCI devices
  dt-bindings: irqchip: renesas-irqc: Document r8a77980 support
  dt-bindings: irqchip: renesas-irqc: Document r8a77470 support
  irqchip/ingenic: Add support for the JZ4725B SoC
  irqchip/stm32: Add exti0 translation for stm32mp1
  genirq: Remove redundant NULL pointer check in __free_irq()
  irqchip/gic-v3-its: Honor hypervisor enforced LPI range
  irqchip/gic-v3: Expose GICD_TYPER in the rdist structure
  irqchip/gic-v3-its: Drop chunk allocation compatibility
  irqchip/gic-v3-its: Move minimum LPI requirements to individual busses
  irqchip/gic-v3-its: Use full range of LPIs
  irqchip/gic-v3-its: Refactor LPI allocator
  genirq: Synchronize only with single thread on free_irq()
  genirq: Update code comments wrt recycled thread_mask
  ...
2018-08-13 10:47:26 -07:00
Thomas Gleixner
9e90c79852 irqchip updates for 4.19
- GICv3 ITS LPI allocation revamp
 - GICv3 support for hypervisor-enforced LPI range
 - GICv3 ITS conversion to raw spinlock
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAltoBXMVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDyUYP/1feAq3F7ZmhCIZka4c6y/m4EBpq
 BjWEEgOAGMEyyB4s98flsRtZcEUxxp6CqEXo2FgCsd1Nj+og7oA7vwOlqy3aGzsi
 9f/Z5Wi6SlG06lH5tmYNkyVbGk2tE3s2FzkH5Rg8qZGk+X3OCOdNs/+G20pYAkSp
 ESePWSapbQUJSExJ1MqzfdHFidtVA1V+ev8BKdIp2ykl1NRae8LJeKHIbqac49Ym
 JclfCLFpQM1M1ElB9j0E8hAvZhz10oOz7TtBR737O/1QEifVyFqGBckPzldvwIJM
 zZ+nR+Yzj1ruD109xwaF1iKy9AinZWhiqrtN7UXJ3jwHtNih+sy0R6FQ38GMNoOC
 0K02n/qStR5xglGr4BmAcWlOuFtBYWfz6HpSVMqaTWWmOxHEiqS6pXtEA+dV/YyI
 wHLbo0YzpWTQm6t1+b/PoByAJ0/hOcD1nOD57b+NGjX7tZV0sGjpGsecvFhTSywh
 BN3COBi9k/FOBrOTGDX1qUAI+mEf76vc2BAC+BkkoiiMg3WlY0E9qfQJguUxHdrb
 0LS3lDZoHCNoz8RZLrUyenTT0NYGcjPGUTinMDJWG79VGXOWFexTDdCuX0kF90CK
 1Zie3O6lrTYolmaiyLUxwukKp1SVUyoA5IpKVwfDJQYUhEfk27yvlzg2MBMcHDRA
 uy3QSkmjx9vw/sAu
 =gKw8
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

- GICv3 ITS LPI allocation revamp
- GICv3 support for hypervisor-enforced LPI range
- GICv3 ITS conversion to raw spinlock
2018-08-06 12:45:42 +02:00
Thomas Gleixner
d1f0301b33 genirq: Make force irq threading setup more robust
The support of force threading interrupts which are set up with both a
primary and a threaded handler wreckaged the setup of regular requested
threaded interrupts (primary handler == NULL).

The reason is that it does not check whether the primary handler is set to
the default handler which wakes the handler thread. Instead it replaces the
thread handler with the primary handler as it would do with force threaded
interrupts which have been requested via request_irq(). So both the primary
and the thread handler become the same which then triggers the warnon that
the thread handler tries to wakeup a not configured secondary thread.

Fortunately this only happens when the driver omits the IRQF_ONESHOT flag
when requesting the threaded interrupt, which is normaly caught by the
sanity checks when force irq threading is disabled.

Fix it by skipping the force threading setup when a regular threaded
interrupt is requested. As a consequence the interrupt request which lacks
the IRQ_ONESHOT flag is rejected correctly instead of silently wreckaging
it.

Fixes: 2a1d3ab898 ("genirq: Handle force threading of irqs with primary and thread handler")
Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Cc: stable@vger.kernel.org
2018-08-03 15:19:01 +02:00
Palmer Dabbelt
4f7799d96e genirq/irqchip: Remove MULTI_IRQ_HANDLER as it's now obselete
Now that every user of MULTI_IRQ_HANDLER has been convereted over to use
GENERIC_IRQ_MULTI_HANDLER remove the references to MULTI_IRQ_HANDLER.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux@armlinux.org.uk
Cc: catalin.marinas@arm.com
Cc: Will Deacon <will.deacon@arm.com>
Cc: jonas@southpole.se
Cc: stefan.kristiansson@saunalahti.fi
Cc: shorne@gmail.com
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: nicolas.pitre@linaro.org
Cc: vladimir.murzin@arm.com
Cc: keescook@chromium.org
Cc: jinb.park7@gmail.com
Cc: yamada.masahiro@socionext.com
Cc: alexandre.belloni@bootlin.com
Cc: pombredanne@nexb.com
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: kstewart@linuxfoundation.org
Cc: jhogan@kernel.org
Cc: mark.rutland@arm.com
Cc: ard.biesheuvel@linaro.org
Cc: james.morse@arm.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: openrisc@lists.librecores.org
Link: https://lkml.kernel.org/r/20180622170126.6308-6-palmer@sifive.com
2018-08-03 12:14:10 +02:00
RAGHU Halharvi
d91cfeb0aa genirq: Remove redundant NULL pointer check in __free_irq()
The NULL pointer check in __free_irq() triggers a 'dereference before NULL
pointer check' warning in static code analysis. It turns out that the check
is redundant because all callers have a NULL pointer check already.

Remove it.

Signed-off-by: RAGHU Halharvi <raghuhack78@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180717102009.7708-1-raghuhack78@gmail.com
2018-07-17 13:35:44 +02:00
Lukas Wunner
519cc8652b genirq: Synchronize only with single thread on free_irq()
When pciehp is converted to threaded IRQ handling, removal of unplugged
devices below a PCIe hotplug port happens synchronously in the IRQ thread.
Removal of devices typically entails a call to free_irq() by their drivers.

If those devices share their IRQ with the hotplug port, __free_irq()
deadlocks because it calls synchronize_irq() to wait for all hard IRQ
handlers as well as all threads sharing the IRQ to finish.

Actually it's sufficient to wait only for the IRQ thread of the removed
device, so call synchronize_hardirq() to wait for all hard IRQ handlers to
finish, but no longer for any threads.  Compensate by rearranging the
control flow in irq_wait_for_interrupt() such that the device's thread is
allowed to run one last time after kthread_stop() has been called.

kthread_stop() blocks until the IRQ thread has completed.  On completion
the IRQ thread clears its oneshot thread_mask bit.  This is safe because
__free_irq() holds the request_mutex, thereby preventing __setup_irq() from
handing out the same oneshot thread_mask bit to a newly requested action.

Stack trace for posterity:
    INFO: task irq/17-pciehp:94 blocked for more than 120 seconds.
    schedule+0x28/0x80
    synchronize_irq+0x6e/0xa0
    __free_irq+0x15a/0x2b0
    free_irq+0x33/0x70
    pciehp_release_ctrl+0x98/0xb0
    pcie_port_remove_service+0x2f/0x40
    device_release_driver_internal+0x157/0x220
    bus_remove_device+0xe2/0x150
    device_del+0x124/0x340
    device_unregister+0x16/0x60
    remove_iter+0x1a/0x20
    device_for_each_child+0x4b/0x90
    pcie_port_device_remove+0x1e/0x30
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x157/0x220
    pci_stop_bus_device+0x7d/0xa0
    pci_stop_bus_device+0x3d/0xa0
    pci_stop_and_remove_bus_device+0xe/0x20
    pciehp_unconfigure_device+0xb8/0x160
    pciehp_disable_slot+0x84/0x130
    pciehp_ist+0x158/0x190
    irq_thread_fn+0x1b/0x50
    irq_thread+0x143/0x1a0
    kthread+0x111/0x130

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: linux-pci@vger.kernel.org
Link: https://lkml.kernel.org/r/d72b41309f077c8d3bee6cc08ad3662d50b5d22a.1529828292.git.lukas@wunner.de
2018-06-24 14:17:27 +02:00
Lukas Wunner
836557bd58 genirq: Update code comments wrt recycled thread_mask
Previously a race existed between __free_irq() and __setup_irq() wherein
the thread_mask of a just removed action could be handed out to a newly
added action and the freed irq thread would then tread on the oneshot
mask bit of the newly added irq thread in irq_finalize_oneshot():

time
 |  __free_irq()
 |    raw_spin_lock_irqsave(&desc->lock, flags);
 |    <remove action from linked list>
 |    raw_spin_unlock_irqrestore(&desc->lock, flags);
 |
 |  __setup_irq()
 |    raw_spin_lock_irqsave(&desc->lock, flags);
 |    <traverse linked list to determine oneshot mask bit>
 |    raw_spin_unlock_irqrestore(&desc->lock, flags);
 |
 |  irq_thread() of freed irq (__free_irq() waits in synchronize_irq())
 |    irq_thread_fn()
 |      irq_finalize_oneshot()
 |        raw_spin_lock_irq(&desc->lock);
 |        desc->threads_oneshot &= ~action->thread_mask;
 |        raw_spin_unlock_irq(&desc->lock);
 v

The race was known at least since 2012 when it was documented in a code
comment by commit e04268b0ef ("genirq: Remove paranoid warnons and bogus
fixups"). The race itself is harmless as nothing touches any of the
potentially freed data after synchronize_irq().

In 2017 the race was close by commit 9114014cf4 ("genirq: Add mutex to
irq desc to serialize request/free_irq()"), apparently inadvertantly so
because the race is neither mentioned in the commit message nor was the
code comment updated.  Make up for that.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: linux-pci@vger.kernel.org
Link: https://lkml.kernel.org/r/32fc25aa35ecef4b2692f57687bb7fc2a57230e2.1529828292.git.lukas@wunner.de
2018-06-24 14:17:26 +02:00
Eric Dumazet
74bdf7815d genirq: Speedup show_interrupts()
Since commit 425a5072dc ("genirq: Free irq_desc with rcu"),
show_interrupts() can be switched to rcu locking, which removes possible
contention on sparse_irq_lock.

The per_cpu count scan and print can be done without holding desc spinlock.

And there is no need to call kstat_irqs_cpu() and abuse irq_to_desc() while
holding rcu read lock, since desc and desc->kstat_irqs wont disappear or
change.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Link: https://lkml.kernel.org/r/20180620150332.163320-1-edumazet@google.com
2018-06-22 14:22:58 +02:00
Marc Zyngier
72a8edc2d9 genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
Debug is missing the IRQCHIP_SUPPORTS_LEVEL_MSI debug entry, making debugfs
slightly less useful.

Take this opportunity to also add a missing comment in the definition of
IRQCHIP_SUPPORTS_LEVEL_MSI.

Fixes: 6988e0e0d2 ("genirq/msi: Limit level-triggered MSI to platform devices")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-2-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00
Jonathan Neuschäfer
0a13ec0bbc genirq: Fix editing error in a comment
When the comment was reflowed to a wider format, the "*" snuck in.

Fixes: ae88a23b32 ("irq: refactor and clean up the free_irq() code flow")
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180617124018.25539-1-j.neuschaefer@gmx.net
2018-06-19 09:19:41 +02:00
Eric Dumazet
4a5f4d2f89 genirq: Use rcu in kstat_irqs_usr()
Jeremy Dorfman identified mutex contention when multiple threads
parse /proc/stat concurrently.

Since commit 425a5072dc ("genirq: Free irq_desc with rcu"),
kstat_irqs_usr() can be switched to rcu locking, which removes this mutex
contention.

show_interrupts() case will be handled in a separate patch.

Reported-by: Jeremy Dorfman <jdorfman@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Link: https://lkml.kernel.org/r/20180618125612.155057-1-edumazet@google.com
2018-06-19 09:19:40 +02:00
Linus Torvalds
f4e5b30d80 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 updates and fixes from Thomas Gleixner:

 - Fix the (late) fallout from the vector management rework causing
   hlist corruption and irq descriptor reference leaks caused by a
   missing sanity check.

   The straight forward fix triggered another long standing issue to
   surface. The pre rework code hid the issue due to being way slower,
   but now the chance that user space sees an EBUSY error return when
   updating irq affinities is way higher, though quite a bunch of
   userspace tools do not handle it properly despite the fact that EBUSY
   could be returned for at least 10 years.

   It turned out that the EBUSY return can be avoided completely by
   utilizing the existing delayed affinity update mechanism for irq
   remapped scenarios as well. That's a bit more error handling in the
   kernel, but avoids fruitless fingerpointing discussions with tool
   developers.

 - Decouple PHYSICAL_MASK from AMD SME as its going to be required for
   the upcoming Intel memory encryption support as well.

 - Handle legacy device ACPI detection properly for newer platforms

 - Fix the wrong argument ordering in the vector allocation tracepoint

 - Simplify the IDT setup code for the APIC=n case

 - Use the proper string helpers in the MTRR code

 - Remove a stale unused VDSO source file

 - Convert the microcode update lock to a raw spinlock as its used in
   atomic context.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
  x86/apic/vector: Print APIC control bits in debugfs
  genirq/affinity: Defer affinity setting if irq chip is busy
  x86/platform/uv: Use apic_ack_irq()
  x86/ioapic: Use apic_ack_irq()
  irq_remapping: Use apic_ack_irq()
  x86/apic: Provide apic_ack_irq()
  genirq/migration: Avoid out of line call if pending is not set
  genirq/generic_pending: Do not lose pending affinity update
  x86/apic/vector: Prevent hlist corruption and leaks
  x86/vector: Fix the args of vector_alloc tracepoint
  x86/idt: Simplify the idt_setup_apic_and_irq_gates()
  x86/platform/uv: Remove extra parentheses
  x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME
  x86: Mark native_set_p4d() as __always_inline
  x86/microcode: Make the late update update_lock a raw lock for RT
  x86/mtrr: Convert to use strncpy_from_user() helper
  x86/mtrr: Convert to use match_string() helper
  x86/vdso: Remove unused file
  x86/i8237: Register device based on FADT legacy boot flag
2018-06-10 09:44:53 -07:00
Thomas Gleixner
12f47073a4 genirq/affinity: Defer affinity setting if irq chip is busy
The case that interrupt affinity setting fails with -EBUSY can be handled
in the kernel completely by using the already available generic pending
infrastructure.

If a irq_chip::set_affinity() fails with -EBUSY, handle it like the
interrupts for which irq_chip::set_affinity() can only be invoked from
interrupt context. Copy the new affinity mask to irq_desc::pending_mask and
set the affinity pending bit. The next raised interrupt for the affected
irq will check the pending bit and try to set the new affinity from the
handler. This avoids that -EBUSY is returned when an affinity change is
requested from user space and the previous change has not been cleaned
up. The new affinity will take effect when the next interrupt is raised
from the device.

Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.819273597@linutronix.de
2018-06-06 15:18:22 +02:00
Thomas Gleixner
d340ebd696 genirq/migration: Avoid out of line call if pending is not set
The upcoming fix for the -EBUSY return from affinity settings requires to
use the irq_move_irq() functionality even on irq remapped interrupts. To
avoid the out of line call, move the check for the pending bit into an
inline helper.

Preparatory change for the real fix. No functional change.

Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
2018-06-06 15:18:20 +02:00
Thomas Gleixner
a33a5d2d16 genirq/generic_pending: Do not lose pending affinity update
The generic pending interrupt mechanism moves interrupts from the interrupt
handler on the original target CPU to the new destination CPU. This is
required for x86 and ia64 due to the way the interrupt delivery and
acknowledge works if the interrupts are not remapped.

However that update can fail for various reasons. Some of them are valid
reasons to discard the pending update, but the case, when the previous move
has not been fully cleaned up is not a legit reason to fail.

Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
a pending cleanup, and rearm the pending move in the irq dexcriptor so it's
tried again when the next interrupt arrives.

Fixes: 996c591227 ("x86/irq: Plug vector cleanup race")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.386544292@linutronix.de
2018-06-06 15:18:19 +02:00
Sebastian Andrzej Siewior
47b82e8818 ide: don't enable/disable interrupts in force threaded-IRQ mode
The interrupts are enabled/disabled so the interrupt handler can run
with enabled interrupts while serving the interrupt and not lose other
interrupts especially the timer tick.
If the system runs with force-threaded interrupts then there is no need
to enable the interrupts.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-05 16:26:47 -04:00
Linus Torvalds
db020be9f7 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:

 - Consolidation of softirq pending:

   The softirq mask and its accessors/mutators have many implementations
   scattered around many architectures. Most do the same things
   consisting in a field in a per-cpu struct (often irq_cpustat_t)
   accessed through per-cpu ops. We can provide instead a generic
   efficient version that most of them can use. In fact s390 is the only
   exception because the field is stored in lowcore.

 - Support for level!?! triggered MSI (ARM)

   Over the past couple of years, we've seen some SoCs coming up with
   ways of signalling level interrupts using a new flavor of MSIs, where
   the MSI controller uses two distinct messages: one that raises a
   virtual line, and one that lowers it. The target MSI controller is in
   charge of maintaining the state of the line.

   This allows for a much simplified HW signal routing (no need to have
   hundreds of discrete lines to signal level interrupts if you already
   have a memory bus), but results in a departure from the current idea
   the kernel has of MSIs.

 - Support for Meson-AXG GPIO irqchip

 - Large stm32 irqchip rework (suspend/resume, hierarchical domains)

 - More SPDX conversions

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  ARM: dts: stm32: Add exti support to stm32mp157 pinctrl
  ARM: dts: stm32: Add exti support for stm32mp157c
  pinctrl/stm32: Add irq_eoi for stm32gpio irqchip
  irqchip/stm32: Add suspend/resume support for hierarchy domain
  irqchip/stm32: Add stm32mp1 support with hierarchy domain
  irqchip/stm32: Prepare common functions
  irqchip/stm32: Add host and driver data structures
  irqchip/stm32: Add suspend support
  irqchip/stm32: Add falling pending register support
  irqchip/stm32: Checkpatch fix
  irqchip/stm32: Optimizes and cleans up stm32-exti irq_domain
  irqchip/meson-gpio: Add support for Meson-AXG SoCs
  dt-bindings: interrupt-controller: New binding for Meson-AXG SoC
  dt-bindings: interrupt-controller: Fix the double quotes
  softirq/s390: Move default mutators of overwritten softirq mask to s390
  softirq/x86: Switch to generic local_softirq_pending() implementation
  softirq/sparc: Switch to generic local_softirq_pending() implementation
  softirq/powerpc: Switch to generic local_softirq_pending() implementation
  softirq/parisc: Switch to generic local_softirq_pending() implementation
  softirq/ia64: Switch to generic local_softirq_pending() implementation
  ...
2018-06-04 19:59:22 -07:00
Christoph Hellwig
3f3942aca6 proc: introduce proc_create_single{,_data}
Variants of proc_create{,_data} that directly take a seq_file show
callback and drastically reduces the boilerplate code in the callers.

All trivial callers converted over.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16 07:23:35 +02:00
Marc Zyngier
0be8153cbc genirq/msi: Allow level-triggered MSIs to be exposed by MSI providers
So far, MSIs have been used to signal edge-triggered interrupts, as
a write is a good model for an edge (you can't "unwrite" something).
On the other hand, routing zillions of wires in an SoC because you
need level interrupts is a bit extreme.

People have come up with a variety of schemes to support this, which
involves sending two messages: one to signal the interrupt, and one
to clear it. Since the kernel cannot represent this, we've ended up
with side-band mechanisms that are pretty awful.

Instead, let's acknoledge the requirement, and ensure that, under the
right circumstances, the irq_compose_msg and irq_write_msg can take
as a parameter an array of two messages instead of a pointer to a
single one. We also add some checking that the compose method only
clobbers the second message if the MSI domain has been created with
the MSI_FLAG_LEVEL_CAPABLE flags.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com
2018-05-13 15:58:59 +02:00
Bartosz Golaszewski
275220ca6e genirq/irq_sim: Remove the license boilerplate
There is the SPDX license identifier now in the irq simulator. Remove the
license boilerplate.

While at it: update the copyright notice, since I did some changes in 2018.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180426200747.8344-1-brgl@bgdev.pl
2018-04-26 22:26:39 +02:00
Ming Lei
d3056812e7 genirq/affinity: Spread irq vectors among present CPUs as far as possible
Commit 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
tried to spread the interrupts accross all possible CPUs to make sure that
in case of phsyical hotplug (e.g. virtualization) the CPUs which get
plugged in after the device was initialized are targeted by a hardware
queue and the corresponding interrupt.

This has a downside in cases where the ACPI tables claim that there are
more possible CPUs than present CPUs and the number of interrupts to spread
out is smaller than the number of possible CPUs. These bogus ACPI tables
are unfortunately not uncommon.

In such a case the vector spreading algorithm assigns interrupts to CPUs
which can never be utilized and as a consequence these interrupts are
unused instead of being mapped to present CPUs. As a result the performance
of the device is suboptimal.

To fix this spread the interrupt vectors in two stages:

 1) Spread as many interrupts as possible among the present CPUs

 2) Spread the remaining vectors among non present CPUs

On a 8 core system, where CPU 0-3 are present and CPU 4-7 are not present,
for a device with 4 queues the resulting interrupt affinity is:

  1) Before 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
	irq 39, cpu list 0
	irq 40, cpu list 1
	irq 41, cpu list 2
	irq 42, cpu list 3

  2) With 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
	irq 39, cpu list 0-2
	irq 40, cpu list 3-4,6
	irq 41, cpu list 5
	irq 42, cpu list 7

  3) With the refined vector spread applied:
	irq 39, cpu list 0,4
	irq 40, cpu list 1,6
	irq 41, cpu list 2,5
	irq 42, cpu list 3,7

On a 8 core system, where all CPUs are present the resulting interrupt
affinity for the 4 queues is:

	irq 39, cpu list 0,1
	irq 40, cpu list 2,3
	irq 41, cpu list 4,5
	irq 42, cpu list 6,7

This is independent of the number of CPUs which are online at the point of
initialization because in such a system the offline CPUs can be easily
onlined afterwards, while in non-present CPUs need to be plugged physically
or virtually which requires external interaction.

The downside of this approach is that in case of physical hotplug the
interrupt vector spreading might be suboptimal when CPUs 4-7 are physically
plugged. Suboptimal from a NUMA point of view and due to the single target
nature of interrupt affinities the later plugged CPUs might not be targeted
by interrupts at all.

Though, physical hotplug systems are not the common case while the broken
ACPI table disease is wide spread. So it's preferred to have as many
interrupts as possible utilized at the point where the device is
initialized.

Block multi-queue devices like NVME create a hardware queue per possible
CPU, so the goal of commit 84676c1f21 to assign one interrupt vector per
possible CPU is still achieved even with physical/virtual hotplug.

[ tglx: Changed from online to present CPUs for the first spreading stage,
  	renamed variables for readability sake, added comments and massaged
  	changelog ]

Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20180308105358.1506-5-ming.lei@redhat.com
2018-04-06 12:19:51 +02:00
Ming Lei
1a2d0914e2 genirq/affinity: Allow irq spreading from a given starting point
To support two stage irq vector spreading, it's required to add a starting
point to the spreading function. No functional change, just preparatory
work for the actual two stage change.

[ tglx: Renamed variables, tidied up the code and massaged changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20180308105358.1506-4-ming.lei@redhat.com
2018-04-06 12:19:51 +02:00
Ming Lei
b3e6aaa8d9 genirq/affinity: Move actual irq vector spreading into a helper function
No functional change, just prepare for converting to 2-stage irq vector
spreading.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20180308105358.1506-3-ming.lei@redhat.com
2018-04-06 12:19:51 +02:00
Ming Lei
47778f33dc genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask
The following patches will introduce two stage irq spreading for improving
irq spread on all possible CPUs.

No functional change.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20180308105358.1506-2-ming.lei@redhat.com
2018-04-06 12:19:50 +02:00
Thomas Gleixner
0211e12dd0 genirq/affinity: Don't return with empty affinity masks on error
When the allocation of node_to_possible_cpumask fails, then
irq_create_affinity_masks() returns with a pointer to the empty affinity
masks array, which will cause malfunction.

Reorder the allocations so the masks array allocation comes last and every
failure path returns NULL.

Fixes: 9a0ef98e18 ("genirq/affinity: Assign vectors to all present CPUs")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
2018-04-06 12:19:50 +02:00
Palmer Dabbelt
d6f73825dc genirq: Make GENERIC_IRQ_MULTI_HANDLER depend on !MULTI_IRQ_HANDLER
These config switches enable the same code in the core and the not yet
converted architecture code. They can be selected both by randconfig builds
and cause linker error because the same symbols are defined twice.

Make the new GENERIC_IRQ_MULTI_HANDLER depend on !MULTI_IRQ_HANDLER to
prevent that. The dependency will be removed once all architectures are
converted over.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20180404043130.31277-4-palmer@sifive.com
2018-04-04 12:04:28 +02:00
Thomas Gleixner
f3f59fbc54 genirq: Remove license boilerplate/references
Now that SPDX identifiers are in place, remove the boilerplate or
references.

The change in timings.c has been acked by the author.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de
2018-03-20 14:23:28 +01:00
Thomas Gleixner
52a65ff560 genirq: Add missing SPDX identifiers
Add SPDX identifiers to files

 - which contain an explicit license boiler plate or reference

 - which do not contain a license reference and were not updated in the
   initial SPDX conversion because the license was deduced by the scanners
   via EXPORT_SYMBOL_GPL as GPL2.0 only.

[ tglx: Moved adding identifiers from the patch which removes the
  	references/boilerplate ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Link: https://lkml.kernel.org/r/20180314212030.668321222@linutronix.de
2018-03-20 14:23:28 +01:00
Thomas Gleixner
90cafdd521 genirq/matrix: Cleanup SPDX identifier
Use the proper SPDX-Identifier format.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Link: https://lkml.kernel.org/r/20180314212030.492674761@linutronix.de
2018-03-20 14:23:28 +01:00
Thomas Gleixner
99bfce5db9 genirq: Cleanup top of file comments
Remove pointless references to the file name itself and condense the
information so it wastes less space.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Link: https://lkml.kernel.org/r/20180314212030.412095827@linutronix.de
2018-03-20 14:23:27 +01:00
Uwe Kleine König
83ac4ca943 genirq: Pass desc to __irq_free instead of irq number
Given that irq_to_desc() is a radix_tree_lookup and the reverse
operation is only a pointer dereference and that all callers of
__free_irq already have the desc, pass the desc instead of the irq
number.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel@pengutronix.de
Link: https://lkml.kernel.org/r/20180319105202.9794-1-u.kleine-koenig@pengutronix.de
2018-03-20 08:52:44 +01:00
Palmer Dabbelt
caacdbf4aa genirq: Add CONFIG_GENERIC_IRQ_MULTI_HANDLER
The arm multi irq handler registration mechanism has been copied into a
handful of architectures, including arm64 and openrisc. RISC-V needs the
same mechanism.

Instead of adding yet another copy for RISC-V copy the arm implementation
into the core code depending on a new Kconfig symbol:
CONFIG_GENERIC_MULTI_IRQ_HANDLER.

Subsequent patches will convert the various architectures.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: jonas@southpole.se
Cc: catalin.marinas@arm.com
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux@armlinux.org.uk
Cc: stefan.kristiansson@saunalahti.fi
Cc: openrisc@lists.librecores.org
Cc: shorne@gmail.com
Cc: linux-riscv@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lkml.kernel.org/r/20180307235731.22627-2-palmer@sifive.com
2018-03-14 21:46:29 +01:00
Thomas Gleixner
b0d8bef8ed Merge branch 'linus' into irq/core to pick up dependencies. 2018-03-14 20:37:31 +01:00
Bartosz Golaszewski
f09777fa89 genirq/irq_sim: Return the base of the irq range from irq_sim_init()
Returning the base of the allocated interrupt range from irq_sim_init() and
devm_irq_sim_init() allows users to handle the logic of associating irq
numbers with any other driver-specific resources without having to use
irq_sim_irqnum().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20180304121018.640-4-brgl@bgdev.pl
2018-03-09 11:44:03 +01:00
Bartosz Golaszewski
28b6afa7d4 genirq/irq_sim: Check the return value of irq_sim_init() for error codes
As discussed with Marc Zyngier: irq_sim_init() and its devres variant
should return the base of the allocated interrupt range on success rather
than 0.

Make devm_irq_sim_init() check for an error code. This is a preparatory
change for modifying irq_sim_init() itself.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20180304121018.640-3-brgl@bgdev.pl
2018-03-09 11:44:03 +01:00
Bartosz Golaszewski
34a866bd45 genirq/irq_sim: Explicitly include slab.h
kfree() is used in the irq_sim code but slab.h is pulled in indirectly via
irq.h. Include it explicitly.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20180304121018.640-2-brgl@bgdev.pl
2018-03-09 11:44:03 +01:00
Andy Shevchenko
d61e2944b6 genirq: Add wakeup sysfs node to show IRQ wakeup state
Surprisingly there is no simple way to see if the IRQ line in question
is wakeup source or not.

Note that wakeup might be an OOB (out-of-band) source like GPIO line
which makes things slightly more complicated.

Add a sysfs node to cover this case.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>
Link: https://lkml.kernel.org/r/20180226155043.67937-1-andriy.shevchenko@linux.intel.com
2018-02-28 18:07:20 +01:00
Thomas Gleixner
651ca2c004 genirq/matrix: Handle CPU offlining proper
At CPU hotunplug the corresponding per cpu matrix allocator is shut down and
the allocated interrupt bits are discarded under the assumption that all
allocated bits have been either migrated away or shut down through the
managed interrupts mechanism.

This is not true because interrupts which are not started up might have a
vector allocated on the outgoing CPU. When the interrupt is started up
later or completely shutdown and freed then the allocated vector is handed
back, triggering warnings or causing accounting issues which result in
suspend failures and other issues.

Change the CPU hotplug mechanism of the matrix allocator so that the
remaining allocations at unplug time are preserved and global accounting at
hotplug is correctly readjusted to take the dormant vectors into account.

Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180222112316.849980972@linutronix.de
2018-02-22 22:05:43 +01:00
Thomas Gleixner
cbf8699996 genirq: Let irq thread follow the effective hard irq affinity
In case of threaded interrupts the thread follows the affinity setting of
the hard interrupt. The related function uses the affinity mask which was
set by either from user space or via one of the kernel mechanisms. This
mask can be wider than the resulting effective affinity of the hard
interrupt. As a consequence the thread might become affine to a completely
different CPU.

Use the effective interrupt affinity if the architecture supports it, so
the hard interrupt and the thread stay on the same CPU.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2018-02-16 15:28:50 +01:00
Andy Shevchenko
0b24a0bbe2 irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
...instead of open coding file operations followed by custom ->open()
callbacks per each attribute.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-02-16 14:22:34 +00:00
Sergey Senozhatsky
22ad305741 genirq: remove unneeded kallsyms include
The file was converted from print_symbol() to %pf some time ago in
commit ef26f20cd1 ("genirq: Print threaded handler in spurious debug
output").  kallsyms does not seem to be needed anymore.

Link: http://lkml.kernel.org/r/20171208025616.16267-10-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06 18:32:47 -08:00
Linus Torvalds
f74a127f66 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Two small changes:

   - a fix for a interrupt regression caused by the vector management
     changes in 4.15 affecting museum pieces which rely on interrupt
     probing for legacy (e.g. parallel port) devices.

     One of the startup calls in the autoprobe code was not changed to
     the new activate_and_startup() function resulting in a warning and
     as a consequence failing to discover the device interrupt.

   - a trivial update to the copyright/license header of the STM32 irq
     chip driver"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Make legacy autoprobing work again
  irqchip/stm32: Fix copyright
2018-02-04 11:41:31 -08:00
Linus Torvalds
ab486bc9a5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:

 - Add a console_msg_format command line option:

     The value "default" keeps the old "[time stamp] text\n" format. The
     value "syslog" allows to see the syslog-like "<log
     level>[timestamp] text" format.

     This feature was requested by people doing regression tests, for
     example, 0day robot. They want to have both filtered and full logs
     at hands.

 - Reduce the risk of softlockup:

     Pass the console owner in a busy loop.

     This is a new approach to the old problem. It was first proposed by
     Steven Rostedt on Kernel Summit 2017. It marks a context in which
     the console_lock owner calls console drivers and could not sleep.
     On the other side, printk() callers could detect this state and use
     a busy wait instead of a simple console_trylock(). Finally, the
     console_lock owner checks if there is a busy waiter at the end of
     the special context and eventually passes the console_lock to the
     waiter.

     The hand-off works surprisingly well and helps in many situations.
     Well, there is still a possibility of the softlockup, for example,
     when the flood of messages stops and the last owner still has too
     much to flush.

     There is increasing number of people having problems with
     printk-related softlockups. We might eventually need to get better
     solution. Anyway, this looks like a good start and promising
     direction.

 - Do not allow to schedule in console_unlock() called from printk():

     This reverts an older controversial commit. The reschedule helped
     to avoid softlockups. But it also slowed down the console output.
     This patch is obsoleted by the new console waiter logic described
     above. In fact, the reschedule made the hand-off less effective.

 - Deprecate "%pf" and "%pF" format specifier:

     It was needed on ia64, ppc64 and parisc64 to dereference function
     descriptors and show the real function address. It is done
     transparently by "%ps" and "pS" format specifier now.

     Sergey Senozhatsky found that all the function descriptors were in
     a special elf section and could be easily detected.

 - Remove printk_symbol() API:

     It has been obsoleted by "%pS" format specifier, and this change
     helped to remove few continuous lines and a less intuitive old API.

 - Remove redundant memsets:

     Sergey removed unnecessary memset when processing printk.devkmsg
     command line option.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
  printk: drop redundant devkmsg_log_str memsets
  printk: Never set console_may_schedule in console_trylock()
  printk: Hide console waiter logic into helpers
  printk: Add console owner and waiter logic to load balance console writes
  kallsyms: remove print_symbol() function
  checkpatch: add pF/pf deprecation warning
  symbol lookup: introduce dereference_symbol_descriptor()
  parisc64: Add .opd based function descriptor dereference
  powerpc64: Add .opd based function descriptor dereference
  ia64: Add .opd based function descriptor dereference
  sections: split dereference_function_descriptor()
  openrisc: Fix conflicting types for _exext and _stext
  lib: do not use print_symbol()
  irq debug: do not use print_symbol()
  sysfs: do not use print_symbol()
  drivers: do not use print_symbol()
  x86: do not use print_symbol()
  unicore32: do not use print_symbol()
  sh: do not use print_symbol()
  mn10300: do not use print_symbol()
  ...
2018-02-01 13:36:15 -08:00
Thomas Gleixner
1beaeacdc8 genirq: Make legacy autoprobing work again
Meelis reported the following warning on a quad P3 HP NetServer museum piece:

WARNING: CPU: 3 PID: 258 at kernel/irq/chip.c:244 __irq_startup+0x80/0x100
EIP: __irq_startup+0x80/0x100
irq_startup+0x7e/0x170
probe_irq_on+0x128/0x2b0
parport_irq_probe.constprop.18+0x8d/0x1af [parport_pc]
parport_pc_probe_port+0xf11/0x1260 [parport_pc]
parport_pc_init+0x78a/0xf10 [parport_pc]
parport_parse_param.constprop.16+0xf0/0xf0 [parport_pc]
do_one_initcall+0x45/0x1e0

This is caused by the rewrite of the irq activation/startup sequence which
missed to convert a callsite in the irq legacy auto probing code.

To fix this irq_activate_and_startup() needs to gain a return value so the
pending logic can work proper.

Fixes: c942cee46b ("genirq: Separate activation and startup")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801301935410.1797@nanos
2018-02-01 11:09:40 +01:00
Linus Torvalds
7bcd342594 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "A rather small set of irq updates this time:

   - removal of the old and now obsolete irq domain debugging code

   - the new Goldfish PIC driver

   - the usual pile of small fixes and updates"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUG
  irq/work: Improve the flag definitions
  irqchip/gic-v3: Fix the driver probe() fail due to disabled GICC entry
  irqchip/irq-goldfish-pic: Add Goldfish PIC driver
  dt-bindings/goldfish-pic: Add device tree binding for Goldfish PIC driver
  irqchip/ompic: fix return value check in ompic_of_init()
  dt-bindings/bcm283x: Define polarity of per-cpu interrupts
  irqchip/irq-bcm2836: Add support for DT interrupt polarity
  dt-bindings/bcm2836-l1-intc: Add interrupt polarity support
2018-01-29 16:47:21 -08:00
Linus Torvalds
0a4b6e2f80 Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
 "This is the main pull request for block IO related changes for the
  4.16 kernel. Nothing major in this pull request, but a good amount of
  improvements and fixes all over the map. This contains:

   - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
     Paolo.

   - Support for SMR zones for deadline and mq-deadline from Damien and
     Christoph.

   - Set of fixes for bcache by way of Michael Lyle, including fixes
     from himself, Kent, Rui, Tang, and Coly.

   - Series from Matias for lightnvm with fixes from Hans Holmberg,
     Javier, and Matias. Mostly centered around pblk, and the removing
     rrpc 1.2 in preparation for supporting 2.0.

   - A couple of NVMe pull requests from Christoph. Nothing major in
     here, just fixes and cleanups, and support for command tracing from
     Johannes.

   - Support for blk-throttle for tracking reads and writes separately.
     From Joseph Qi. A few cleanups/fixes also for blk-throttle from
     Weiping.

   - Series from Mike Snitzer that enables dm to register its queue more
     logically, something that's alwways been problematic on dm since
     it's a stacked device.

   - Series from Ming cleaning up some of the bio accessor use, in
     preparation for supporting multipage bvecs.

   - Various fixes from Ming closing up holes around queue mapping and
     quiescing.

   - BSD partition fix from Richard Narron, fixing a problem where we
     can't mount newer (10/11) FreeBSD partitions.

   - Series from Tejun reworking blk-mq timeout handling. The previous
     scheme relied on atomic bits, but it had races where we would think
     a request had timed out if it to reused at the wrong time.

   - null_blk now supports faking timeouts, to enable us to better
     exercise and test that functionality separately. From me.

   - Kill the separate atomic poll bit in the request struct. After
     this, we don't use the atomic bits on blk-mq anymore at all. From
     me.

   - sgl_alloc/free helpers from Bart.

   - Heavily contended tag case scalability improvement from me.

   - Various little fixes and cleanups from Arnd, Bart, Corentin,
     Douglas, Eryu, Goldwyn, and myself"

* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
  block: remove smart1,2.h
  nvme: add tracepoint for nvme_complete_rq
  nvme: add tracepoint for nvme_setup_cmd
  nvme-pci: introduce RECONNECTING state to mark initializing procedure
  nvme-rdma: remove redundant boolean for inline_data
  nvme: don't free uuid pointer before printing it
  nvme-pci: Suspend queues after deleting them
  bsg: use pr_debug instead of hand crafted macros
  blk-mq-debugfs: don't allow write on attributes with seq_operations set
  nvme-pci: Fix queue double allocations
  block: Set BIO_TRACE_COMPLETION on new bio during split
  blk-throttle: use queue_is_rq_based
  block: Remove kblockd_schedule_delayed_work{,_on}()
  blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
  blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
  lib/scatterlist: Fix chaining support in sgl_alloc_order()
  blk-throttle: track read and write request individually
  block: add bdev_read_only() checks to common helpers
  block: fail op_is_write() requests to read-only partitions
  blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
  ...
2018-01-29 11:51:49 -08:00
Marc Zyngier
c5baa1be8f irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUG
CONFIG_IRQ_DOMAIN_DEBUG is similar to CONFIG_GENERIC_IRQ_DEBUGFS,
just with less information.

Spring cleanup time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yang Shunyong <shunyong.yang@hxt-semitech.com>
Link: https://lkml.kernel.org/r/20180117142647.23622-1-marc.zyngier@arm.com
2018-01-24 12:32:58 +01:00
Thomas Gleixner
a0c9259dc4 irq/matrix: Spread interrupts on allocation
Keith reported an issue with vector space exhaustion on a server machine
which is caused by the i40e driver allocating 168 MSI interrupts when the
driver is initialized, even when most of these interrupts are not used at
all.

The x86 vector allocation code tries to avoid the immediate allocation with
the reservation mode, but the card uses MSI and does not support MSI entry
masking, which prevents reservation mode and requires immediate vector
allocation.

The matrix allocator is a bit naive and prefers the first CPU in the
cpumask which describes the possible target CPUs for an allocation. That
results in allocating all 168 vectors on CPU0 which later causes vector
space exhaustion when the NVMe driver tries to allocate managed interrupts
on each CPU for the per CPU queues.

Avoid this by finding the CPU which has the lowest vector allocation count
to spread out the non managed interrupt accross the possible target CPUs.

Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Reported-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Keith Busch <keith.busch@intel.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801171557330.1777@nanos
2018-01-18 11:38:41 +01:00
Christoph Hellwig
84676c1f21 genirq/affinity: assign vectors to all possible CPUs
Currently we assign managed interrupt vectors to all present CPUs.  This
works fine for systems were we only online/offline CPUs.  But in case of
systems that support physical CPU hotplug (or the virtualized version of
it) this means the additional CPUs covered for in the ACPI tables or on
the command line are not catered for.  To fix this we'd either need to
introduce new hotplug CPU states just for this case, or we can start
assining vectors to possible but not present CPUs.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Fixes: 4b855ad371 ("blk-mq: Create hctx for each present CPU")
Cc: linux-kernel@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-12 11:01:38 -07:00
Sergey Senozhatsky
3b14f08d16 irq debug: do not use print_symbol()
print_symbol() is a very old API that has been obsoleted by %pS format
specifier in a normal printk() call.

Replace print_symbol() with a direct printk("%pS") call and avoid
using continuous lines.

Link: http://lkml.kernel.org/r/20171212073453.21455-1-sergey.senozhatsky@gmail.com
To: Andrew Morton <akpm@linux-foundation.org>
To: Russell King <linux@armlinux.org.uk>
To: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Salter <msalter@redhat.com>
To: Tony Luck <tony.luck@intel.com>
To: David Howells <dhowells@redhat.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
To: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: Borislav Petkov <bp@alien8.de>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <vgupta@synopsys.com>
To: Fengguang Wu <fengguang.wu@intel.com>
To: David Laight <David.Laight@ACULAB.COM>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-sh@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: updated commit message]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2018-01-05 15:23:59 +01:00
Thomas Gleixner
bc976233a8 genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
The new reservation mode for interrupts assigns a dummy vector when the
interrupt is allocated and assigns a real vector when the interrupt is
requested. The reservation mode prevents vector pressure when devices with
a large amount of queues/interrupts are initialized, but only a minimal
subset of those queues/interrupts is actually used.

This mode has an issue with MSI interrupts which cannot be masked. If the
driver is not careful or the hardware emits an interrupt before the device
irq is requestd by the driver then the interrupt ends up on the dummy
vector as a spurious interrupt which can cause malfunction of the device or
in the worst case a lockup of the machine.

Change the logic for the reservation mode so that the early activation of
MSI interrupts checks whether:

 - the device is a PCI/MSI device
 - the reservation mode of the underlying irqdomain is activated
 - PCI/MSI masking is globally enabled
 - the PCI/MSI device uses either MSI-X, which supports masking, or
   MSI with the maskbit supported.

If one of those conditions is false, then clear the reservation mode flag
in the irq data of the interrupt and invoke irq_domain_activate_irq() with
the reserve argument cleared. In the x86 vector code, clear the can_reserve
flag in the vector allocation data so a subsequent free_irq() won't create
the same situation again. The interrupt stays assigned to a real vector
until pci_disable_msi() is invoked and all allocations are undone.

Fixes: 4900be8360 ("x86/vector/msi: Switch to global reservation mode")
Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
2017-12-29 21:13:05 +01:00
Thomas Gleixner
702cb0a028 genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
The 'early' argument of irq_domain_activate_irq() is actually used to
denote reservation mode. To avoid confusion, rename it before abuse
happens.

No functional change.

Fixes: 7249164346 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
2017-12-29 21:13:04 +01:00
Thomas Gleixner
69790ba92b genirq: Introduce IRQD_CAN_RESERVE flag
Add a new flag to mark interrupts which can use reservation mode. This is
going to be used in subsequent patches to disable reservation mode for a
certain class of MSI devices.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
2017-12-29 21:13:04 +01:00
Thomas Gleixner
da5dd9e854 genirq/msi: Handle reactivation only on success
When analyzing the fallout of the x86 vector allocation rework it turned
out that the error handling in msi_domain_alloc_irqs() is broken.

If MSI_FLAG_MUST_REACTIVATE is set for a MSI domain then it clears the
activation flag for a successfully initialized msi descriptor. If a
subsequent initialization fails then the error handling code path does not
deactivate the interrupt because the activation flag got cleared.

Move the clearing of the activation flag outside of the initialization loop
so that an eventual failure can be cleaned up correctly.

Fixes: 22d0b12f35 ("genirq/irqdomain: Add force reactivation flag to irq domains")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Mikael Pettersson <mikpelinux@gmail.com>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Sakari Ailus <sakari.ailus@intel.com>,
Cc: linux-media@vger.kernel.org
2017-12-29 21:13:04 +01:00
Guenter Roeck
11bca0a83f genirq: Guard handle_bad_irq log messages
An interrupt storm on a bad interrupt will cause the kernel
log to be clogged.

[   60.089234] ->handle_irq():  ffffffffbe2f803f,
[   60.090455] 0xffffffffbf2af380
[   60.090510] handle_bad_irq+0x0/0x2e5
[   60.090522] ->irq_data.chip(): ffffffffbf2af380,
[   60.090553]    IRQ_NOPROBE set
[   60.090584] ->handle_irq():  ffffffffbe2f803f,
[   60.090590] handle_bad_irq+0x0/0x2e5
[   60.090596] ->irq_data.chip(): ffffffffbf2af380,
[   60.090602] 0xffffffffbf2af380
[   60.090608] ->action():           (null)
[   60.090779] handle_bad_irq+0x0/0x2e5

This was seen when running an upstream kernel on Acer Chromebook R11.  The
system was unstable as result.

Guard the log message with __printk_ratelimit to reduce the impact.  This
won't prevent the interrupt storm from happening, but at least the system
remains stable.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dtor@chromium.org>
Cc: Joe Perches <joe@perches.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197953
Link: https://lkml.kernel.org/r/1512234784-21038-1-git-send-email-linux@roeck-us.net
2017-12-28 12:28:29 +01:00
Andrew Lunn
39c3fd5895 kernel/irq: Extend lockdep class for request mutex
The IRQ code already has support for lockdep class for the lock mutex
in an interrupt descriptor. Extend this to add a second class for the
request mutex in the descriptor. Not having a class is resulting in
false positive splats in some code paths.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: linus.walleij@linaro.org
Cc: grygorii.strashko@ti.com
Cc: f.fainelli@gmail.com
Link: https://lkml.kernel.org/r/1512234664-21555-1-git-send-email-andrew@lunn.ch
2017-12-28 12:26:35 +01:00
Thomas Gleixner
bb5c434282 genirq/matrix: Fix the precedence fix for real
The previous commit which made the operator precedence in
irq_matrix_available() explicit made the implicit brokenness explicitely
wrong. It was wrong in the original commit already. The overworked
maintainer did not notice it either when merging the patch.

Replace the confusing '?' construct by a simple and obvious if ().

Fixes: 75f1133873 ("genirq/matrix: Make - vs ?: Precedence explicit")
Reported-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
2017-12-04 20:50:35 +01:00
Linus Torvalds
dec0029a59 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Glexiner:

 - unbreak the irq trigger type check for legacy platforms

 - a handful fixes for ARM GIC v3/4 interrupt controllers

 - a few trivial fixes all over the place

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/matrix: Make - vs ?: Precedence explicit
  irqchip/imgpdc: Use resource_size function on resource object
  irqchip/qcom: Fix u32 comparison with value less than zero
  irqchip/exiu: Fix return value check in exiu_init()
  irqchip/gic-v3-its: Remove artificial dependency on PCI
  irqchip/gic-v4: Add forward definition of struct irq_domain_ops
  irqchip/gic-v3: pr_err() strings should end with newlines
  irqchip/s3c24xx: pr_err() strings should end with newlines
  irqchip/gic-v3: Fix ppi-partitions lookup
  irqchip/gic-v4: Clear IRQ_DISABLE_UNLAZY again if mapping fails
  genirq: Track whether the trigger type has been set
2017-11-26 14:39:20 -08:00
Kees Cook
75f1133873 genirq/matrix: Make - vs ?: Precedence explicit
Noticed with a Clang build. This improves the readability of the ?:
expression, as it has lower precedence than the - expression. Show
explicitly that - is evaluated first.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171122205645.GA27125@beast
2017-11-23 20:09:31 +01:00
Kees Cook
24ed960abf treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
This changes all DEFINE_TIMER() callbacks to use a struct timer_list
pointer instead of unsigned long. Since the data argument has already been
removed, none of these callbacks are using their argument currently, so
this renames the argument to "unused".

Done using the following semantic patch:

@match_define_timer@
declarer name DEFINE_TIMER;
identifier _timer, _callback;
@@

 DEFINE_TIMER(_timer, _callback);

@change_callback depends on match_define_timer@
identifier match_define_timer._callback;
type _origtype;
identifier _origarg;
@@

 void
-_callback(_origtype _origarg)
+_callback(struct timer_list *unused)
 { ... }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:05 -08:00
Linus Torvalds
c9b012e5f4 arm64 updates for 4.15
Plenty of acronym soup here:
 
 - Initial support for the Scalable Vector Extension (SVE)
 - Improved handling for SError interrupts (required to handle RAS events)
 - Enable GCC support for 128-bit integer types
 - Remove kernel text addresses from backtraces and register dumps
 - Use of WFE to implement long delay()s
 - ACPI IORT updates from Lorenzo Pieralisi
 - Perf PMU driver for the Statistical Profiling Extension (SPE)
 - Perf PMU driver for Hisilicon's system PMUs
 - Misc cleanups and non-critical fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJaCcLqAAoJELescNyEwWM0JREH/2FbmD/khGzEtP8LW+o9D8iV
 TBM02uWQxS1bbO1pV2vb+512YQO+iWfeQwJH9Jv2FZcrMvFv7uGRnYgAnJuXNGrl
 W+LL6OhN22A24LSawC437RU3Xe7GqrtONIY/yLeJBPablfcDGzPK1eHRA0pUzcyX
 VlyDruSHWX44VGBPV6JRd3x0vxpV8syeKOjbRvopRfn3Nwkbd76V3YSfEgwoTG5W
 ET1sOnXLmHHdeifn/l1Am5FX1FYstpcd7usUTJ4Oto8y7e09tw3bGJCD0aMJ3vow
 v1pCUWohEw7fHqoPc9rTrc1QEnkdML4vjJvMPUzwyTfPrN+7uEuMIEeJierW+qE=
 =0qrg
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (<= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
2017-11-15 10:56:56 -08:00
Thomas Gleixner
41cc30412d irqchip updates for 4.15, take #4
- A core irq fix for legacy cases where the irq trigger is not reported
   by firmware
 - A couple of GICv3/4 fixes (Kconfig, of-node refcount, error handling)
 - Trivial pr_err fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAloJ2UYVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDtkcP/i2HrgT5CmoA315h5HLXfFImxZwg
 YG7ti3zqu1TM7HXNlfBgSltqiU5UgYOd2KjgUR+w30t1239DwxVdtoy6ZgGWXlfe
 xmHlXAiEDkqaAV3P/RHKkl8xRDkyWhKmfY1ym/So0J68EPytuiZiT4wJZO39u/xk
 Gi2fW38gywl4bEb9MA48noEnkNkr93DjJfFyiPxPYiAaWeFRoP+X5l171kTpCdSv
 sT5hfQvfsz7mQZVecMBwGg6QuWnlUplkzPb4lNhuk/tJTU97vf3Py8abzeiGqBMa
 5NoH9PUMZs0L8kv6DgR2DEOmhmmk0+mwO7VWVT3eVtYoxSS8oW0i2OzgB9kCOy8+
 LZi+M0fMSL6hiAgwlV5YrHYvsVl2rGjCNAqC+q2ggpW4XH9Kql2caki331mGqAyr
 qDeI0SWaABz5c9ajtsfQ78T0FE70xwIJ5TT1RmtKt2LFCpJjBacjK3vUST2ypIrF
 k0AJTmBwEefKNvxTrjkZtenblCVo92lT+boAv7HuI612NicVSBEsK7S+96kGlGMw
 KNi+cwGriJRWIbsn+VMpLFGyPgEmWQ1p/4Z/tkwSevqOl/k8KrNw6+VoyDS1PmJU
 gIQ0hh8wMnm2l9VGJARc0+pHKr0KYN52j+jZI5+OGNKKAZ0jGYI4t2JiemtOwmet
 q2KNdm6mvBYUYirj
 =/8DQ
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip updates for 4.15, take #4 from Marc Zyngier

 - A core irq fix for legacy cases where the irq trigger is not reported
   by firmware

 - A couple of GICv3/4 fixes (Kconfig, of-node refcount, error handling)

 - Trivial pr_err fixes
2017-11-14 11:23:05 +01:00
Linus Torvalds
b18d62891a Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC updates from Thomas Gleixner:
 "This update provides a major overhaul of the APIC initialization and
  vector allocation code:

   - Unification of the APIC and interrupt mode setup which was
     scattered all over the place and was hard to follow. This also
     distangles the timer setup from the APIC initialization which
     brings a clear separation of functionality.

     Great detective work from Dou Lyiang!

   - Refactoring of the x86 vector allocation mechanism. The existing
     code was based on nested loops and rather convoluted APIC callbacks
     which had a horrible worst case behaviour and tried to serve all
     different use cases in one go. This led to quite odd hacks when
     supporting the new managed interupt facility for multiqueue devices
     and made it more or less impossible to deal with the vector space
     exhaustion which was a major roadblock for server hibernation.

     Aside of that the code dealing with cpu hotplug and the system
     vectors was disconnected from the actual vector management and
     allocation code, which made it hard to follow and maintain.

     Utilizing the new bitmap matrix allocator core mechanism, the new
     allocator and management code consolidates the handling of system
     vectors, legacy vectors, cpu hotplug mechanisms and the actual
     allocation which needs to be aware of system and legacy vectors and
     hotplug constraints into a single consistent entity.

     This has one visible change: The support for multi CPU targets of
     interrupts, which is only available on a certain subset of
     CPUs/APIC variants has been removed in favour of single interrupt
     targets. A proper analysis of the multi CPU target feature revealed
     that there is no real advantage as the vast majority of interrupts
     end up on the CPU with the lowest APIC id in the set of target CPUs
     anyway. That change was agreed on by the relevant folks and allowed
     to simplify the implementation significantly and to replace rather
     fragile constructs like the vector cleanup IPI with straight
     forward and solid code.

     Furthermore this allowed to cleanly separate the allocation details
     for legacy, normal and managed interrupts:

      * Legacy interrupts are not longer wasting 16 vectors
        unconditionally

      * Managed interrupts have now a guaranteed vector reservation, but
        the actual vector assignment happens when the interrupt is
        requested. It's guaranteed not to fail.

      * Normal interrupts no longer allocate vectors unconditionally
        when the interrupt is set up (IO/APIC init or MSI(X) enable).
        The mechanism has been switched to a best effort reservation
        mode. The actual allocation happens when the interrupt is
        requested. Contrary to managed interrupts the request can fail
        due to vector space exhaustion, but drivers must handle a fail
        of request_irq() anyway. When the interrupt is freed, the vector
        is handed back as well.

        This solves a long standing problem with large unconditional
        vector allocations for a certain class of enterprise devices
        which prevented server hibernation due to vector space
        exhaustion when the unused allocated vectors had to be migrated
        to CPU0 while unplugging all non boot CPUs.

     The code has been equipped with trace points and detailed debugfs
     information to aid analysis of the vector space"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE
  PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code
  genirq: Add config option for reservation mode
  x86/vector: Use correct per cpu variable in free_moved_vector()
  x86/apic/vector: Ignore set_affinity call for inactive interrupts
  x86/apic: Fix spelling mistake: "symmectic" -> "symmetric"
  x86/apic: Use dead_cpu instead of current CPU when cleaning up
  ACPI/init: Invoke early ACPI initialization earlier
  x86/vector: Respect affinity mask in irq descriptor
  x86/irq: Simplify hotplug vector accounting
  x86/vector: Switch IOAPIC to global reservation mode
  x86/vector/msi: Switch to global reservation mode
  x86/vector: Handle managed interrupts proper
  x86/io_apic: Reevaluate vector configuration on activate()
  iommu/amd: Reevaluate vector configuration on activate()
  iommu/vt-d: Reevaluate vector configuration on activate()
  x86/apic/msi: Force reactivation of interrupts at startup time
  x86/vector: Untangle internal state from irq_cfg
  x86/vector: Compile SMP only code conditionally
  x86/apic: Remove unused callbacks
  ...
2017-11-13 18:29:23 -08:00
Linus Torvalds
2bcc673101 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Yet another big pile of changes:

   - More year 2038 work from Arnd slowly reaching the point where we
     need to think about the syscalls themself.

   - A new timer function which allows to conditionally (re)arm a timer
     only when it's either not running or the new expiry time is sooner
     than the armed expiry time. This allows to use a single timer for
     multiple timeout requirements w/o caring about the first expiry
     time at the call site.

   - A new NMI safe accessor to clock real time for the printk timestamp
     work. Can be used by tracing, perf as well if required.

   - A large number of timer setup conversions from Kees which got
     collected here because either maintainers requested so or they
     simply got ignored. As Kees pointed out already there are a few
     trivial merge conflicts and some redundant commits which was
     unavoidable due to the size of this conversion effort.

   - Avoid a redundant iteration in the timer wheel softirq processing.

   - Provide a mechanism to treat RTC implementations depending on their
     hardware properties, i.e. don't inflict the write at the 0.5
     seconds boundary which originates from the PC CMOS RTC to all RTCs.
     No functional change as drivers need to be updated separately.

   - The usual small updates to core code clocksource drivers. Nothing
     really exciting"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits)
  timers: Add a function to start/reduce a timer
  pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday()
  timer: Prepare to change all DEFINE_TIMER() callbacks
  netfilter: ipvs: Convert timers to use timer_setup()
  scsi: qla2xxx: Convert timers to use timer_setup()
  block/aoe: discover_timer: Convert timers to use timer_setup()
  ide: Convert timers to use timer_setup()
  drbd: Convert timers to use timer_setup()
  mailbox: Convert timers to use timer_setup()
  crypto: Convert timers to use timer_setup()
  drivers/pcmcia: omap1: Fix error in automated timer conversion
  ARM: footbridge: Fix typo in timer conversion
  drivers/sgi-xp: Convert timers to use timer_setup()
  drivers/pcmcia: Convert timers to use timer_setup()
  drivers/memstick: Convert timers to use timer_setup()
  drivers/macintosh: Convert timers to use timer_setup()
  hwrng/xgene-rng: Convert timers to use timer_setup()
  auxdisplay: Convert timers to use timer_setup()
  sparc/led: Convert timers to use timer_setup()
  mips: ip22/32: Convert timers to use timer_setup()
  ...
2017-11-13 17:56:58 -08:00
Linus Torvalds
670310dfba Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq core updates from Thomas Gleixner:
 "A rather large update for the interrupt core code and the irq chip drivers:

   - Add a new bitmap matrix allocator and supporting changes, which is
     used to replace the x86 vector allocator which comes with separate
     pull request. This allows to replace the convoluted nested loop
     allocation function in x86 with a facility which supports the
     recently added property of managed interrupts proper and allows to
     switch to a best effort vector reservation scheme, which addresses
     problems with vector exhaustion.

   - A large update to the ARM GIC-V3-ITS driver adding support for
     range selectors.

   - New interrupt controllers:
       - Meson and Meson8 GPIO
       - BCM7271 L2
       - Socionext EXIU

     If you expected that this will stop at some point, I have to
     disappoint you. There are new ones posted already. Sigh!

   - STM32 interrupt controller support for new platforms.

   - A pile of fixes, cleanups and updates to the MIPS GIC driver

   - The usual small fixes, cleanups and updates all over the place.
     Most visible one is to move the irq chip drivers Kconfig switches
     into a separate Kconfig menu"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  genirq: Fix type of shifting literal 1 in __setup_irq()
  irqdomain: Drop pointless NULL check in virq_debug_show_one
  genirq/proc: Return proper error code when irq_set_affinity() fails
  irq/work: Use llist_for_each_entry_safe
  irqchip: mips-gic: Print warning if inherited GIC base is used
  irqchip/mips-gic: Add pr_fmt and reword pr_* messages
  irqchip/stm32: Move the wakeup on interrupt mask
  irqchip/stm32: Fix initial values
  irqchip/stm32: Add stm32h7 support
  dt-bindings/interrupt-controllers: Add compatible string for stm32h7
  irqchip/stm32: Add multi-bank management
  irqchip/stm32: Select GENERIC_IRQ_CHIP
  irqchip/exiu: Add support for Socionext Synquacer EXIU controller
  dt-bindings: Add description of Socionext EXIU interrupt controller
  irqchip/gic-v3-its: Fix VPE activate callback return value
  irqchip: mips-gic: Make IPI bitmaps static
  irqchip: mips-gic: Share register writes in gic_set_type()
  irqchip: mips-gic: Remove gic_vpes variable
  irqchip: mips-gic: Use num_possible_cpus() to reserve IPIs
  irqchip: mips-gic: Configure EIC when CPUs come online
  ...
2017-11-13 17:33:11 -08:00
Rasmus Villemoes
ffc661c99f genirq: Fix type of shifting literal 1 in __setup_irq()
If ffz() ever returns a value >= 31 then the following shift is undefined
behaviour because the literal 1 which gets shifted is treated as signed
integer.

In practice, the bug is probably harmless, since the first undefined shift
count is 31 which results - ignoring UB - in (int)(0x80000000). This gets
sign extended so bit 32-63 will be set as well and all subsequent
__setup_irq() calls would just end up hitting the -EBUSY branch.

However, a sufficiently aggressive optimizer may use the UB of 1<<31
to decide that doesn't happen, and hence elide the sign-extension
code, so that subsequent calls can indeed get ffz > 31.

In any case, the right thing to do is to make the literal 1UL.

[ tglx: For this to happen a single interrupt would have to be shared by 32
  	devices. Hardware like that does not exist and would have way more
  	problems than that. ]

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20171030213548.16831-1-linux@rasmusvillemoes.dk
2017-11-12 23:25:40 +01:00
Rasmus Villemoes
306eb5a38d irqdomain: Drop pointless NULL check in virq_debug_show_one
data has been already derefenced unconditionally, so it's pointless to do a
NULL pointer check on it afterwards. Drop it.

[ tglx: Depersonify changelog. ]

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20171112212904.28574-1-linux@rasmusvillemoes.dk
2017-11-12 23:25:40 +01:00
Wen Yaxng
6714796edc genirq/proc: Return proper error code when irq_set_affinity() fails
write_irq_affinity() returns the number of written bytes, which means
success, unconditionally whether the actual irq_set_affinity() call
succeeded or not.

Add proper error handling and pass the error code returned from
irq_set_affinity() back to user space in case of failure.

[ tglx: Fixed coding style and massaged changelog ]

Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Cc: zhong.weidong@zte.com.cn
Link: https://lkml.kernel.org/r/1510106103-184761-1-git-send-email-wen.yang99@zte.com.cn
2017-11-12 23:25:39 +01:00
Marc Zyngier
4f8413a3a7 genirq: Track whether the trigger type has been set
When requesting a shared interrupt, we assume that the firmware
support code (DT or ACPI) has called irqd_set_trigger_type
already, so that we can retrieve it and check that the requester
is being reasonnable.

Unfortunately, we still have non-DT, non-ACPI systems around,
and these guys won't call irqd_set_trigger_type before requesting
the interrupt. The consequence is that we fail the request that
would have worked before.

We can either chase all these use cases (boring), or address it
in core code (easier). Let's have a per-irq_desc flag that
indicates whether irqd_set_trigger_type has been called, and
let's just check it when checking for a shared interrupt.
If it hasn't been set, just take whatever the interrupt
requester asks.

Fixes: 382bd4de61 ("genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs")
Cc: stable@vger.kernel.org
Reported-and-tested-by: Petr Cvek <petrcvekcz@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-10 09:49:48 +00:00
Frederic Weisbecker
a934d4d15f irq/timings: Use lockdep to assert IRQs are disabled/enabled
Use lockdep to check that IRQs are enabled or disabled as expected. This
way the sanity check only shows overhead when concurrency correctness
debug code is enabled.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David S . Miller <davem@davemloft.net>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1509980490-4285-10-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-08 11:13:52 +01:00
Ingo Molnar
141d3b1daa Merge branch 'linus' into x86/apic, to resolve conflicts
Conflicts:
	arch/x86/include/asm/x2apic.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:51:10 +01:00
Thomas Gleixner
722c908f84 irqchip updates for 4.15, take #2
- A number of MIPS GIC updates and cleanups
 - One GICv4 update
 - Another firmware workaround for GICv2
 - Support for Mason8 GPIOs
 - Tiny documentation fix
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAln7UeAVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDtLkQAII8Ij6h2nWN8MaSOcrOsar58PU9
 gfa8HidjH7QRht5xcpaAYXO0+VO7etP37JpTlzcJfpaAU8tFTKAMJKbWaHRe2KKV
 YbfRYWhnyizmb2Sn5+JDKlVVPAgR9tZN9UQyDceERb4yZmGOc9nHBObJIPAYtvnr
 zKLPo5ldVs7bCQan+uGXMyncVGgJ70rYCNPhHqzyNbAUgkI3XA85Wk9NgBXQCr/2
 8SiE6hs+QdkELy545Pi0oPxvqY5q4f2QmgOjWhQtJgbAQvWpizMi2Qz4zH8Gs89t
 Iln6zaw8315vfPSiyXckbIlMoUUzejLuxtvwvdgdrjE5vogNkj+Z874R3yNbLZ2H
 CW+DZz9gv+e2uzIiqBBpw1DatWBQQxoXvUVW1fyEOqwZPpr7JvAKwJ9Nwhc0kUdG
 H2AjO1KL/WqiqCpEDAYF+zY3zLUSAaoNjMWX8+nlaUstgQO4hSyPyU3RUJXmpHf8
 DHIVelZlhurKjQP6SfH1QKs+Wqv2QAZb8Otb8Pt/IMw73mXmRMbMPB7uOc2A7TqG
 uceAQuOyVBG+QF2dGiXZ5RqTNtL1gWns+S+abJCtRkD+JV83teZw3I4sMnVpeqtO
 rdavJVejrePck8Tgd1qOCGfXUQrBhCG1xJFkt/Y9tWbduU0R4V1y5J7LcGBLMYSy
 cV5WD3fejfjayTlT
 =Aozt
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull the second batch of irqchip updates for 4.15 from marc Zyngier:

 - A number of MIPS GIC updates and cleanups
 - One GICv4 update
 - Another firmware workaround for GICv2
 - Support for Mason8 GPIOs
 - Tiny documentation fix
2017-11-02 19:18:08 +01:00
Marc Zyngier
05f3647359 Linux 4.14-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZ0WQ6AAoJEHm+PkMAQRiGuloH/3sF4qfBhPuJo8OTf0uCtQ18
 4Ux9zZbm81df/Jjz0exAp1Jqk+TvdIS3OXPWcKilvbUBP16hQcsxFTnI/5QF+YcN
 87aNr+OCMJzOBK4suN1yhzO46NYHeIizdB0PTZVL1Zsto69Tt31D8VJmgH6oBxAw
 Isb/nAkOr31dZ9PI5UEExTIanUt6EywVb0UswA+2rNl3h1UkeasQCpMpK2n6HBhU
 kVD7sxEd/CN0MmfhB0HrySSam/BeSpOtzoU9bemOwrU2uu9+5+2rqMe7Gsdj4nX6
 3Kk+7FQNktlrhxCZIFN/+CdusOUuDd8r/75d7DnsRK5YvSb0sZzJkfD3Nba68Ms=
 =7J2+
 -----END PGP SIGNATURE-----

Merge tag 'v4.14-rc3' into irq/irqchip-4.15

Required merge to get mainline irqchip updates.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-02 15:54:58 +00:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Rakib Mullick
10d94ff4d5 irq/core: Fix boot crash when the irqaffinity= boot parameter is passed on CPUMASK_OFFSTACK=y kernels(v1)
When the irqaffinity= kernel parameter is passed in a CPUMASK_OFFSTACK=y
kernel, it fails to boot, because zalloc_cpumask_var() cannot be used before
initializing the slab allocator to allocate a cpumask.

So, use alloc_bootmem_cpumask_var() instead.

Also do some cleanups while at it: in init_irq_default_affinity() remove
an #ifdef via using cpumask_available().

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171026045800.27087-1-rakib.mullick@gmail.com
Link: http://lkml.kernel.org/r/20171101041451.12581-1-rakib.mullick@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-01 09:56:39 +01:00
Christoffer Dall
250a53d6fc genirq: Document vcpu_info usage for percpu_devid interrupts
It is currently unclear how to set the VCPU affinity for a percpu_devid
interrupt , since the Linux irq_data structure describes the state for
multiple interrupts, one for each physical CPU on the system.  Since
each such interrupt can be associated with different VCPUs or none at
all, associating a single VCPU state with such an interrupt does not
capture the necessary semantics.

The implementers of irq_set_affinity are the Intel and AMD IOMMUs, and
the ARM GIC irqchip.  The Intel and AMD callers do not appear to use
percpu_devid interrupts, and the ARM GIC implementation only checks the
pointer against NULL vs. non-NULL.

Therefore, simply update the function documentation to explain the
expected use in the context of percpu_devid interrupts, allowing future
changes or additions to irqchip implementers to do the right thing.

Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvm@vger.kernel.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lkml.kernel.org/r/1509093281-15225-13-git-send-email-cdall@linaro.org
2017-10-29 16:18:37 +01:00
Masahiro Yamada
d03cc2d8ae irqdomain: Add __rcu annotations to radix tree slot
Fix different address spaces warning of sparse.

kernel/irq/irqdomain.c:1463:14: warning: incorrect type in assignment (different address spaces)
kernel/irq/irqdomain.c:1463:14:    expected void **slot
kernel/irq/irqdomain.c:1463:14:    got void [noderef] <asn:4>**
kernel/irq/irqdomain.c:1465:66: warning: incorrect type in argument 2 (different address spaces)
kernel/irq/irqdomain.c:1465:66:    expected void [noderef] <asn:4>**slot
kernel/irq/irqdomain.c:1465:66:    got void **slot

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-19 10:30:15 +01:00
Masahiro Yamada
f1d7835854 irqdomain: Move revmap_trees_mutex to struct irq_domain
The revmap_trees_mutex protects domain->revmap_tree.  There is no
need to make it global because it is allowed to modify revmap_tree
of two different domains concurrently.  Having said that, this would
not be a actual bottleneck because the interrupt map/unmap does not
occur quite often.  Rather, the motivation is to tidy up the code
from a data structure point of view.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-19 10:30:03 +01:00
Thomas Gleixner
2b5175c4fa genirq: Add config option for reservation mode
The interrupt reservation mode requires reactivation of PCI/MSI
interrupts. Create a config option, so the PCI code can set the
corresponding flag when required.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poulson <jopoulso@microsoft.com>
Cc: Mihai Costache <v-micos@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-pci@vger.kernel.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Simon Xiao <sixiao@microsoft.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jork Loeser <Jork.Loeser@microsoft.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: devel@linuxdriverproject.org
Cc: KY Srinivasan <kys@microsoft.com>
Link: https://lkml.kernel.org/r/20171017075600.369375409@linutronix.de
2017-10-18 15:38:30 +02:00
Will Deacon
5ffeb0501c genirq: export irq_get_percpu_devid_partition to modules
Any modular driver using cluster-affine PPIs needs to be able to call
irq_get_percpu_devid_partition so that it can enable the IRQ on the
correct subset of CPUs.

This patch exports the symbol so that it can be called from within a
module.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:29 +01:00
Thomas Gleixner
3d51969ce3 irqchip updates for 4.14-rc5
- Fix unfortunate mistake in the GICv3 ITS binding example
 - Two fixes for the recently merged GICv4 support
 - GICv3 ITS 52bit PA fixes
 - Generic irqchip mask-ack fix, and its application to the tango irqchip
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlng8XwVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDllYQAMbrw2Y5kvJ1M1jm3an9cmZ6G11y
 X/BBuEJ4cqfDTGqrZSAhsTeGuzeWOjkA0GWEDWSC1MgHJEIor53blTDNuU10fDng
 2jTYCdajXzLJufMwy42IuR0H7OrcRWdSElDhxaoVxlP5/S02iyxvpWnInfDf1TTX
 EpmhFQORYinNTP9+d3lyPdiBLia+N38OaH7ahOCLyHAWIOJNQYcX1bPA6fNkZs7X
 GH4jjyC6DJgfeYCqWBU1qE4U6ENftdxIOjIm93Ax2QElx1srJFWzDGTbDmyxY5YT
 5SQAfVWmR0I3cJ10TqurSTzIXF+pJoKsU8sZSVbM6wLQgQefi8fnP5jqLBji/PiO
 29MUQrG1DshooK4PqDmOS7PN3LPlT3YZelpZ9yyZB7qqW5lNvQVb6elQrFUC0FbG
 t5JUqmxeR1lksq0O+BzQeDaivOtMAqqe5eaUW6cMeb17DV+gK2rW2m+gJQ0wx5yq
 5DuOGmMepebC3DyvSZaZyJtf55N4gmK8BRNpunA4Qtrx51YchHScJugmj/T8udgt
 wf/RuK/dsTzswuexP0FtvvFXwoFSa7SlDzXGhSFsSS6dCIo9Rkw7DuLKY2TQvzwA
 EUlaJ9RvSofrqgyrNVsRVJUi+/LIlFdbiC4vF0rnstqW4RghkZrHPB2wHbEskPiC
 BO6nyWbrT+4hR3Kz
 =yAAX
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip updates for 4.14-rc5 from Marc Zyngier:

- Fix unfortunate mistake in the GICv3 ITS binding example
- Two fixes for the recently merged GICv4 support
- GICv3 ITS 52bit PA fixes
- Generic irqchip mask-ack fix, and its application to the tango irqchip
2017-10-16 10:26:46 +02:00
Doug Berger
0d08af35f1 genirq: generic chip: remove irq_gc_mask_disable_reg_and_ack()
Any usage of the irq_gc_mask_disable_reg_and_ack() function has
been replaced with the desired functionality.

The incorrect and ambiguously named function is removed here to
prevent accidental misuse.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-13 16:31:05 +01:00
Doug Berger
20608924cc genirq: generic chip: Add irq_gc_mask_disable_and_ack_set()
The irq_gc_mask_disable_reg_and_ack() function name implies that it
provides the combined functions of irq_gc_mask_disable_reg() and
irq_gc_ack().  However, the implementation does not actually do
that since it writes the mask instead of the disable register. It
also does not maintain the mask cache which makes it inappropriate
to use with other masking functions.

In addition, commit 659fb32d1b ("genirq: replace irq_gc_ack() with
{set,clr}_bit variants (fwd)") effectively renamed irq_gc_ack() to
irq_gc_ack_set_bit() so this function probably should have also been
renamed at that time.

The generic chip code currently provides three functions for use
with the irq_mask member of the irq_chip structure and two functions
for use with the irq_ack member of the irq_chip structure. These
functions could be combined into six functions for use with the
irq_mask_ack member of the irq_chip structure.  However, since only
one of the combinations is currently used, only the function
irq_gc_mask_disable_and_ack_set() is added by this commit.

The '_reg' and '_bit' portions of the base function name were left
out of the new combined function name in an attempt to keep the
function name length manageable with the 80 character source code
line length while still allowing the distinct aspects of each
combination to be captured by the name.

If other combinations are desired in the future please add them to
the irq generic chip library at that time.

Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-10-13 16:31:05 +01:00
Thomas Gleixner
331b57d148 Merge branch 'irq/urgent' into x86/apic
Pick up core changes which affect the vector rework.
2017-10-12 11:02:50 +02:00
Thomas Gleixner
e43b3b5854 genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs
Managed interrupts can end up in a stale state on CPU hotplug. If the
interrupt is not targeting a single CPU, i.e. the affinity mask spawns
multiple CPUs then the following can happen:

After boot:

dstate:   0x01601200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  0

After offlining CPU 31 - 24

dstate:   0x01a31000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 24
pending:  0

Now CPU 25 gets onlined again, so it should get the effective interrupt
affinity for this interruopt, but due to the x86 interrupt affinity setter
restrictions this ends up after restarting the interrupt with:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

So the interrupt is still affine to CPU 24, which was the last CPU to go
offline of that affinity set and the move to an online CPU within 24-31,
in this case 25, is pending. This mechanism is x86/ia64 specific as those
architectures cannot move interrupts from thread context and do this when
an interrupt is actually handled. So the move is set to pending.

Whats worse is that offlining CPU 25 again results in:

dstate:   0x01601300
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_SETAFFINITY_PENDING
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 24
pending:  24-31

This means the interrupt has not been shut down, because the outgoing CPU
is not in the effective affinity mask, but of course nothing notices that
the effective affinity mask is pointing at an offline CPU.

In the case of restarting a managed interrupt the move restriction does not
apply, so the affinity setting can be made unconditional. This needs to be
done _before_ the interrupt is started up as otherwise the condition for
moving it from thread context would not longer be fulfilled.

With that change applied onlining CPU 25 after offlining 31-24 results in:

dstate:   0x01600200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
node:     0
affinity: 24-31
effectiv: 25
pending:  

And after offlining CPU 25:

dstate:   0x01a30000
            IRQD_IRQ_DISABLED
            IRQD_IRQ_MASKED
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_MANAGED
            IRQD_MANAGED_SHUTDOWN
node:     0
affinity: 24-31
effectiv: 25
pending:  

which is the correct and expected result.

Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Reported-by: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: axboe@kernel.dk
Cc: linux-scsi@vger.kernel.org
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: mpe@ellerman.id.au
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: keith.busch@intel.com
Cc: peterz@infradead.org
Cc: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
2017-10-09 13:26:48 +02:00
Thomas Gleixner
60b09c51bb genirq/cpuhotplug: Add sanity check for effective affinity mask
The effective affinity mask handling has no safety net when the mask is not
updated by the interrupt chip or the mask contains offline CPUs.

If that happens the CPU unplug code fails to migrate interrupts.

Add sanity checks and emit a warning when the mask contains only offline
CPUs.

Fixes: 415fcf1a22 ("genirq/cpuhotplug: Use effective affinity mask")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
2017-10-09 13:26:48 +02:00
Thomas Gleixner
19e1d4e947 genirq: Warn when effective affinity is not updated
Emit a one time warning when the effective affinity mask is enabled in
Kconfig, but the interrupt chip does not update the mask in its
irq_set_affinity() callback,

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710042208400.2406@nanos
2017-10-09 13:26:48 +02:00
Kees Cook
1d27e3e225 timer: Remove expires and data arguments from DEFINE_TIMER
Drop the arguments from the macro and adjust all callers with the
following script:

  perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \
    $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h)

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # for m68k parts
Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog parts
Acked-by: David S. Miller <davem@davemloft.net> # for networking parts
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org> # for wireless parts
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-mips@linux-mips.org
Cc: Petr Mladek <pmladek@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux1394-devel@lists.sourceforge.net
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Michael Reed <mdr@sgi.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-pm@vger.kernel.org
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mark Gross <mark.gross@intel.com>
Cc: linux-watchdog@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: netdev@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lkml.kernel.org/r/1507159627-127660-11-git-send-email-keescook@chromium.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-10-05 15:01:20 +02:00
Jeffy Chen
72364d3206 irq/generic-chip: Don't replace domain's name
When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.

Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.

Remove the name assignment to prevent this.

Fixes: d59f6617ee ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
2017-09-28 12:18:59 +02:00
Masahiro Yamada
7755d83e48 irqdomain: Add __rcu annotations to radix tree accessors
Fix various address spaces warning of sparse.

kernel/irq/irqdomain.c:1463:14: warning: incorrect type in assignment (different address spaces)
kernel/irq/irqdomain.c:1463:14:    expected void **slot
kernel/irq/irqdomain.c:1463:14:    got void [noderef] <asn:4>**
kernel/irq/irqdomain.c:1465:66: warning: incorrect type in argument 2 (different address spaces)
kernel/irq/irqdomain.c:1465:66:    expected void [noderef] <asn:4>**slot
kernel/irq/irqdomain.c:1465:66:    got void **slot

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: https://lkml.kernel.org/r/1506082841-11530-1-git-send-email-yamada.masahiro@socionext.com
2017-09-25 21:23:44 +02:00
Thomas Gleixner
ec0f7cd273 genirq/matrix: Add tracepoints
Add tracepoints for the irq bitmap matrix allocator.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.279468022@linutronix.de
2017-09-25 20:38:26 +02:00
Thomas Gleixner
2f75d9e1c9 genirq: Implement bitmap matrix allocator
Implement the infrastructure for a simple bitmap based allocator, which
will replace the x86 vector allocator. It's in the core code as other
architectures might be able to reuse/extend it. For now it only implements
allocations for single CPUs, but it's simple to add multi CPU allocation
support if required.

The concept is rather simple:

 Global information:
 	system_vector bitmap
	global accounting

 PerCPU information:
 	allocation bitmap
	managed allocation bitmap
	local accounting

The system vector bitmap is used to exclude vectors system wide from the
allocation space.

The allocation bitmap is used to keep track of per cpu used vectors.

The managed allocation bitmap is used to reserve vectors for managed
interrupts.

When a regular (non managed) interrupt allocation happens then the
following rule applies:

      tmpmap = system_map | alloc_map | managed_map
      find_zero_bit(tmpmap)

Oring the bitmaps together gives the real available space. The same rule
applies for reserving a managed interrupt vector. But contrary to the
regular interrupts the reservation only marks the bit in the managed map
and therefor excludes it from the regular allocations. The managed map is
only cleaned out when the a managed interrupt is completely released and it
stays alive accross CPU offline/online operations.

For managed interrupt allocations the rule is:

      tmpmap = managed_map & ~alloc_map
      find_first_bit(tmpmap)

This returns the first bit which is in the managed map, but not yet
allocated in the allocation map. The allocation marks it in the allocation
map and hands it back to the caller for use.

The rest of the code are helper functions to handle the various
requirements and the accounting which are necessary to replace the x86
vector allocation code. The result is a single patch as the evolution of
this infrastructure cannot be represented in bits and pieces.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.185437174@linutronix.de
2017-09-25 20:38:26 +02:00
Thomas Gleixner
22d0b12f35 genirq/irqdomain: Add force reactivation flag to irq domains
Allow irqdomains to tell the core code, that after early activation the
interrupt needs to be reactivated at request_irq() time.

This allows reservation of vectors at early activation time and actual
vector assignment at request_irq() time.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.106242536@linutronix.de
2017-09-25 20:38:25 +02:00
Thomas Gleixner
42e1cc2dc5 genirq/irqdomain: Propagate early activation
Propagate the early activation mode to the irqdomain activate()
callbacks. This is required for the upcoming reservation, late vector
assignment scheme, so that the early activation call can act accordingly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213153.028353660@linutronix.de
2017-09-25 20:38:25 +02:00
Thomas Gleixner
bb9b428a5c genirq/irqdomain: Allow irq_domain_activate_irq() to fail
Allow irq_domain_activate_irq() to fail. This is required to support a
reservation and late vector assignment scheme.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.933882227@linutronix.de
2017-09-25 20:38:24 +02:00
Thomas Gleixner
7249164346 genirq/irqdomain: Update irq_domain_ops.activate() signature
The irq_domain_ops.activate() callback has no return value and no way to
tell the function that the activation is early.

The upcoming changes to support a reservation scheme which allows to assign
interrupt vectors on x86 only when the interrupt is actually requested
requires:

  - A return value, so activation can fail at request_irq() time
  
  - Information that the activate invocation is early, i.e. before
    request_irq().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.848490816@linutronix.de
2017-09-25 20:38:24 +02:00
Thomas Gleixner
c942cee46b genirq: Separate activation and startup
Activation of an interrupt and startup are currently a combo
functionlity. That works so far, but upcoming changes require a strict
separation because the activation can fail in future.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.754334077@linutronix.de
2017-09-25 20:38:24 +02:00
Thomas Gleixner
239306fee8 genirq: Set managed shut down flag at init
Managed interrupts should start up in managed shutdown mode. Set the status
flag when initialising the irq descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.669687742@linutronix.de
2017-09-25 20:38:23 +02:00
Thomas Gleixner
457f6d3507 genirq: Make state consistent for !IRQ_DOMAIN_HIERARCHY
In the !IRQ_DOMAIN_HIERARCHY cas the activation stubs are not
setting/clearing the activation status bits. This is not a problem at the
moment, but upcoming changes require a correct status.

Add the set/clear incovations to the stub functions and move them to the
core internal header to avoid duplication and visibility outside the core.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.591985591@linutronix.de
2017-09-25 20:38:23 +02:00
Thomas Gleixner
c3e7239a7f irqdomain/debugfs: Provide domain specific debug callback
Some interrupt domains like the X86 vector domain has special requirements
for debugging, like showing the vector usage on the CPUs.

Add a callback to the irqdomain ops which can be filled in by domains which
require it and add conditional invocations to the irqdomain and the per irq
debug files.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.512937505@linutronix.de
2017-09-25 20:38:23 +02:00
Thomas Gleixner
07557ccb8c genirq/msi: Capture device name for debugfs
For debugging the allocation of unused or potentially leaked interrupt
descriptor it's helpful to have some information about the site which
allocated them. In case of MSI this is simple because the caller hands the
device struct pointer into the domain allocation function.

Duplicate the device name and show it in the debugfs entry of the interrupt
descriptor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.433038426@linutronix.de
2017-09-25 20:38:22 +02:00
Thomas Gleixner
e0b477941d genirq/debugfs: Show debug information for all irq descriptors
Currently the debugfs shows only information about actively used interrupts
like /proc/irq/ does. That's fine for most cases, but not helpful when
internals of allocated, but unused interrupt descriptors have to
debugged. It's also useful to provide information about all descriptors so
leaks can be debugged in a simpler way.

Move the debugfs registration to the descriptor allocation code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Yu Chen <yu.c.chen@intel.com>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: https://lkml.kernel.org/r/20170913213152.355525908@linutronix.de
2017-09-25 20:38:22 +02:00
Alexandru Moise
2827a418ca genirq: Check __free_irq() return value for NULL
__free_irq() can return a NULL irqaction for example when trying to free
already-free IRQ, but the callsite unconditionally dereferences the
returned pointer.

Fix this by adding a check and return NULL.

Signed-off-by: Alexandru Moise <00moses.alexander00@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170919200412.GA29985@gmail.com
2017-09-25 16:40:31 +02:00
Thomas Gleixner
9cb067ef8a genirq: Fix cpumask check in __irq_startup_managed()
The result of cpumask_any_and() is invalid when result greater or equal
nr_cpu_ids. The current check is checking for greater only. Fix it.

Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rui Zhang <rui.zhang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/20170913213152.272283444@linutronix.de
2017-09-16 20:20:56 +02:00
Linus Torvalds
33f82bda01 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
 "A sparse irq race/locking fix, and a MSI irq domains population fix"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Make sparse_irq_lock protect what it should protect
  genirq/msi: Fix populating multiple interrupts
2017-09-12 11:25:56 -07:00
Thomas Gleixner
12ac1d0f6c genirq: Make sparse_irq_lock protect what it should protect
for_each_active_irq() iterates the sparse irq allocation bitmap. The caller
must hold sparse_irq_lock. Several code pathes expect that an active bit in
the sparse bitmap also has a valid interrupt descriptor.

Unfortunately that's not true. The (de)allocation is a two step process,
which holds the sparse_irq_lock only across the queue/remove from the radix
tree and the set/clear in the allocation bitmap.

If a iteration locks sparse_irq_lock between the two steps, then it might
see an active bit but the corresponding irq descriptor is NULL. If that is
dereferenced unconditionally, then the kernel oopses. Of course, all
iterator sites could be audited and fixed, but....

There is no reason why the sparse_irq_lock needs to be dropped between the
two steps, in fact the code becomes simpler when the mutex is held across
both and the semantics become more straight forward, so future problems of
missing NULL pointer checks in the iteration are avoided and all existing
sites are fixed in one go.

Expand the lock held sections so both operations are covered and the bitmap
and the radixtree are in sync.

Fixes: a05a900a51 ("genirq: Make sparse_lock a mutex")
Reported-and-tested-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
2017-09-07 09:30:38 +02:00
John Keeping
596a7a1d09 genirq/msi: Fix populating multiple interrupts
On allocating the interrupts routed via a wire-to-MSI bridge, the allocator
iterates over the MSI descriptors to build the hierarchy, but fails to use
the descriptor interrupt number, and instead uses the base number,
generating the wrong IRQ domain mappings.

The fix is to use the MSI descriptor interrupt number when setting up
the interrupt instead of the base interrupt for the allocation range.

The only saving grace is that although the MSI descriptors are allocated
in bulk, the wired interrupts are only allocated one by one (so
desc->irq == virq) and the bug went unnoticed so far.

Fixes: 2145ac9310 ("genirq/msi: Add msi_domain_populate_irqs")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170906103540.373864a2.john@metanate.com
2017-09-06 11:41:20 +02:00
Linus Torvalds
e7d0c41ecc Device properties framework updates for v4.14-rc1
- Introduce fwnode operations for all of the separate types of
    "firmware nodes" that can be handled by the device properties
    framework and drop the type field from struct fwnode_handle
    (Sakari Ailus, Arnd Bergmann).
 
  - Make the device properties framework use const fwnode arguments
    where possible (Sakari Ailus).
 
  - Add a helper for the consolidated handling of node references
    to the device properties framework (Sakari Ailus).
 
  - Switch over the ACPI part of the device properties framework
    to the new UUID API (Andy Shevchenko).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZrcHoAAoJEILEb/54YlRxVH4P/i7MVmWxZW1qosqt8NbI+kqu
 rjxBiQ1YaPuwWiZk5LMRQWIr4Y52v+8uwoVAoQbpfkpQpxpUtIApqFGGHkOK091S
 6wcwdAJv78m7dQGJZ96nQkBdw+qCUG+s9L3KMfXYiipwyG7bg4BVcs5jZcIqcZ4F
 2xecG6DMn4ESwFbZyVULWyQh50tSBztaHEG6AU2T/07yXU3RNJmwAVVZzpHdtA80
 mDbWcCFjcmhrpPa0Aq6MrSMjKso1zd8Es+xwYhXsIQpD1l0HhLLQ0X4veSPcPG4B
 aSNEYuribpvZ2FIRti7H7gi/F+Arm9vPdc9WHbOPLOIF1z+GJKiqjBuxUrfXKPqG
 v1W3f1bcApe9DfmC5z1wZBi2d7thQOzRFfc8WRrMybQ6z1MAqqe5PfAlgpMFmL22
 8ZCzzXIBUsfUjVlwYBvgkKvpLioEl88otWGdhewWY6F+DZ8+vPyvrpi15P36Xgos
 ijX89cvyfze3m5GW08hQ6DTOVvaFoMyucYfSo6/MBamw9fbUgiEgBfUAsQyb3sRU
 8g1KrwkAX8KFmoocX/AVjvwVBaKNdYeJ9Gy6EItAPxNl+F1q6vjkO0r/VeSrO1KW
 3GRqw5MZP35DD9IRo4DTAjwtNVkgIUjpG/hfB7l3PFdDxWfeiM5tf2zMExhT0nIR
 h8s8mn61KZp0gpsE02FS
 =0rnk
 -----END PGP SIGNATURE-----

Merge tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull device properties framework updates from Rafael Wysocki:
 "These introduce fwnode operations for all of the separate types of
  'firmware nodes' that can be handled by the device properties
  framework, make the framework use const fwnode arguments all over, add
  a helper for the consolidated handling of node references and switch
  over the framework to the new UUID API.

  Specifics:

   - Introduce fwnode operations for all of the separate types of
     'firmware nodes' that can be handled by the device properties
     framework and drop the type field from struct fwnode_handle (Sakari
     Ailus, Arnd Bergmann).

   - Make the device properties framework use const fwnode arguments
     where possible (Sakari Ailus).

   - Add a helper for the consolidated handling of node references to
     the device properties framework (Sakari Ailus).

   - Switch over the ACPI part of the device properties framework to the
     new UUID API (Andy Shevchenko)"

* tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: device property: Switch to use new generic UUID API
  device property: export irqchip_fwnode_ops
  device property: Introduce fwnode_property_get_reference_args
  device property: Constify fwnode property API
  device property: Constify argument to pset fwnode backend
  ACPI: Constify internal fwnode arguments
  ACPI: Constify acpi_bus helper functions, switch to macros
  ACPI: Prepare for constifying acpi_get_next_subnode() fwnode argument
  device property: Get rid of struct fwnode_handle type field
  ACPI: Use IS_ERR_OR_NULL() instead of non-NULL check in is_acpi_data_node()
2017-09-05 12:50:00 -07:00
Linus Torvalds
93cc1228b4 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt subsystem delivers this time:

   - Refactoring of the GIC-V3 driver to prepare for the GIC-V4 support

   - Initial GIC-V4 support

   - Consolidation of the FSL MSI support

   - Utilize the effective affinity interface in various ARM irqchip
     drivers

   - Yet another interrupt chip driver (UniPhier AIDET)

   - Bulk conversion of the irq chip driver to use %pOF

   - The usual small fixes and improvements all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  irqchip/ls-scfg-msi: Add MSI affinity support
  irqchip/ls-scfg-msi: Add LS1043a v1.1 MSI support
  irqchip/ls-scfg-msi: Add LS1046a MSI support
  arm64: dts: ls1046a: Add MSI dts node
  arm64: dts: ls1043a: Share all MSIs
  arm: dts: ls1021a: Share all MSIs
  arm64: dts: ls1043a: Fix typo of MSI compatible string
  arm: dts: ls1021a: Fix typo of MSI compatible string
  irqchip/ls-scfg-msi: Fix typo of MSI compatible strings
  irqchip/irq-bcm7120-l2: Use correct I/O accessors for irq_fwd_mask
  irqchip/mmp: Make mmp_intc_conf const
  irqchip/gic: Make irq_chip const
  irqchip/gic-v3: Advertise GICv4 support to KVM
  irqchip/gic-v4: Enable low-level GICv4 operations
  irqchip/gic-v4: Add some basic documentation
  irqchip/gic-v4: Add VLPI configuration interface
  irqchip/gic-v4: Add VPE command interface
  irqchip/gic-v4: Add per-VM VPE domain creation
  irqchip/gic-v3-its: Set implementation defined bit to enable VLPIs
  irqchip/gic-v3-its: Allow doorbell interrupts to be injected/cleared
  ...
2017-09-04 13:08:27 -07:00
Thomas Gleixner
9fbd7fd28d irqchip updates for 4.14
- irqchip-specific part of the monster GICv4 series
 - new UniPhier AIDET irqchip driver
 - new variants of some Freescale MSI widget
 - blanket removal of of_node->full_name in printk
 - random collection of fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlmoQxkVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDVUMQAIyE1q3fjSNZ+EkfK8+mbcWC80Wc
 suklgcqVbHahu6FHuHALlR7rgJIPSaFYFpDIwybA9A0Pwia/5Jf2mOL3RGVF4f97
 nyHlSS16kocZz8lKn+NtgcaUiFRma3y7GNek0pnsSlm+Vu+Syw3xssN+yYcGujTu
 jWRocvIqIJlScpzHG/Ulx3tZTXYfipQFfIQ3+9gm/i+KYqTwGDH/MsdxI7uAbctx
 YJGwLVtv4MGGmNHaq4iS64d55yrG/4Yqv+q92zFaaxj+V0di+Ds01+MDhdq8X7N/
 fhLGY/Yh/I3FiIIdIO/O1sj1EPO6lLbg4DPYXIMdjzwhBdKhu8i66/ttH/Kx//Aa
 1hhLZSN6rYiJM3lWcTxej45bs8MR/3MBm4gKpZxTgJ12YRIwgY8lRyoqXTlto5ls
 w10yi5wFsJaAO1E/HdEs/dyndV1jpvGo9KIRnfh7E5+Hw7PCYs9kZa4MUtq9RYT8
 Civyppi2sMfKYtGvwm+FS6sIigoFCh4DJ5MmUbM5CLh5imnggyYJlTsJdBuxVDZM
 1RoDnX/YebpVceezIZ/oCKq60Utck0Oqge2pc+NjVQupAp/x/13R/7DQPnFCq/OL
 Avx9kBtSzdYmYgE3EWt9n+h4LT23JpOym2OEUF3fhpPE96BKAJkMEPB/QlBi39fo
 0cZEX8M7xq5KjRJy
 =3ZS3
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates for 4.14 from Marc Zyngier:

- irqchip-specific part of the monster GICv4 series
- new UniPhier AIDET irqchip driver
- new variants of some Freescale MSI widget
- blanket removal of of_node->full_name in printk
- random collection of fixes
2017-08-31 20:12:51 +02:00
Thomas Gleixner
b33394ba5c genirq/proc: Avoid uninitalized variable warning
kernel/irq/proc.c: In function ‘show_irq_affinity’:
include/linux/cpumask.h:24:29: warning: ‘mask’ may be used uninitialized in this function [-Wmaybe-uninitialized]
 #define cpumask_bits(maskp) ((maskp)->bits)

gcc is silly, but admittedly it can't know that this won't be called with
anything else than the enumerated constants.

Shut up the warning by creating a default clause.

Fixes: 6bc6d4abd2 ("genirq/proc: Use the the accessor to report the effective affinity
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-08-25 22:40:26 +02:00
Dan Carpenter
20c4d49c0f irqdomain: Prevent potential NULL pointer dereference in irq_domain_push_irq()
This code generates a Smatch warning:

  kernel/irq/irqdomain.c:1511 irq_domain_push_irq()
  warn: variable dereferenced before check 'root_irq_data' (see line 1508)

irq_get_irq_data() can return a NULL pointer, but the code dereferences
the returned pointer before checking it.

Move the NULL pointer check before the dereference.

[ tglx: Rewrote changelog to be precise and conforming to the instructions
  	in submitting-patches and added a Fixes tag. Sigh! ]

Fixes: 495c38d300 ("irqdomain: Add irq_domain_{push,pop}_irq() functions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170825121409.6rfv4vt6ztz2oqkt@mwanda
2017-08-25 22:40:26 +02:00
kbuild test robot
ce8bdd6957 genirq: Fix semicolon.cocci warnings
kernel/irq/proc.c:69:2-3: Unneeded semicolon

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

Fixes: 0d3f54257d ("genirq: Introduce effective affinity mask")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kbuild-all@01.org
Link: http://lkml.kernel.org/r/20170822075053.GA93890@lkp-hsx02
2017-08-25 22:40:25 +02:00
Marc Zyngier
0abce64a55 genirq: Let irq_set_vcpu_affinity() iterate over hierarchy
When assigning an interrupt to a vcpu, it is not unlikely that
the level of the hierarchy implementing irq_set_vcpu_affinity
is not the top level (think a generic MSI domain on top of a
virtualization aware interrupt controller).

In such a case, let's iterate over the hierarchy until we find
an irqchip implementing it.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-08-23 11:09:14 +01:00
Alexey Dobriyan
8fbbe2d7cc genirq/ipi: Fixup checks against nr_cpu_ids
Valid CPU ids are [0, nr_cpu_ids-1] inclusive.

Fixes: 3b8e29a82d ("genirq: Implement ipi_send_mask/single()")
Fixes: f9bce791ae ("genirq: Add a new function to get IPI reverse mapping")
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170819095751.GB27864@avx2
2017-08-20 10:49:05 +02:00
Marc Zyngier
e8f241893d genirq: Restore trigger settings in irq_modify_status()
irq_modify_status starts by clearing the trigger settings from
irq_data before applying the new settings, but doesn't restore them,
leaving them to IRQ_TYPE_NONE.

That's pretty confusing to the potential request_irq() that could
follow. Instead, snapshot the settings before clearing them, and restore
them if the irq_modify_status() invocation was not changing the trigger.

Fixes: 1e2a7d7849 ("irqdomain: Don't set type when mapping an IRQ")
Reported-and-tested-by: jeffy <jeffy.chen@rock-chips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170818095345.12378-1-marc.zyngier@arm.com
2017-08-18 12:04:14 +02:00
Thomas Gleixner
6629695465 Merge branch 'irq/for-gpio' into irq/core
Merge the flow handlers and irq domain extensions which are in a separate
branch so they can be consumed by the gpio folks.
2017-08-18 11:22:27 +02:00
David Daney
495c38d300 irqdomain: Add irq_domain_{push,pop}_irq() functions
For an already existing irqdomain hierarchy, as might be obtained via
a call to pci_enable_msix_range(), a PCI driver wishing to add an
additional irqdomain to the hierarchy needs to be able to insert the
irqdomain to that already initialized hierarchy.  Calling
irq_domain_create_hierarchy() allows the new irqdomain to be created,
but no existing code allows for initializing the associated irq_data.

Add a couple of helper functions (irq_domain_push_irq() and
irq_domain_pop_irq()) to initialize the irq_data for the new
irqdomain added to an existing hierarchy.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-6-git-send-email-david.daney@cavium.com
2017-08-18 11:21:42 +02:00
David Daney
0d12ec075a irqdomain: Check for NULL function pointer in irq_domain_free_irqs_hierarchy()
A follow-on patch will call irq_domain_free_irqs_hierarchy() when the
free() function pointer may be NULL.

Add a NULL pointer check to handle this new use case.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-5-git-send-email-david.daney@cavium.com
2017-08-18 11:21:42 +02:00
David Daney
b526adfe1b irqdomain: Factor out code to add and remove items to and from the revmap
The code to add and remove items to and from the revmap occurs several
times.

In preparation for the follow on patches that add more uses of this
code, factor this out in to separate static functions.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-4-git-send-email-david.daney@cavium.com
2017-08-18 11:21:41 +02:00
David Daney
7703b08cc9 genirq: Add handle_fasteoi_{level,edge}_irq flow handlers
Follow-on patch for gpio-thunderx uses a irqdomain hierarchy which
requires slightly different flow handlers, add them to chip.c which
contains most of the other flow handlers.  Make these conditionally
compiled based on CONFIG_IRQ_FASTEOI_HIERARCHY_HANDLERS.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-3-git-send-email-david.daney@cavium.com
2017-08-18 11:21:41 +02:00
David Daney
65efd9a49a genirq: Export more irq_chip_*_parent() functions
Many of the family of functions including irq_chip_mask_parent(),
irq_chip_unmask_parent() are exported, but not all.

Add EXPORT_SYMBOL_GPL to irq_chip_enable_parent,
irq_chip_disable_parent and irq_chip_set_affinity_parent, so they
likewise are usable from modules.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandre Courbot <gnurou@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/1503017616-3252-2-git-send-email-david.daney@cavium.com
2017-08-18 11:21:40 +02:00
Marc Zyngier
6bc6d4abd2 genirq/proc: Use the the accessor to report the effective affinity
If CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK is defined, but that the
interrupt is not single target, the effective affinity reported in
/proc/irq/x/effective_affinity will be empty, which is not the truth.

Instead, use the accessor to report the affinity, which will pick
the right mask.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Wei Xu <xuwei5@hisilicon.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Gregory Clement <gregory.clement@free-electrons.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Link: http://lkml.kernel.org/r/20170818083925.10108-3-marc.zyngier@arm.com
2017-08-18 10:54:39 +02:00
Marc Zyngier
536e2e34bd genirq/debugfs: Triggering of interrupts from userspace
When developing new (and therefore buggy) interrupt related
code, it can sometimes be useful to inject interrupts without
having to rely on a device to actually generate them.

This functionnality relies either on the irqchip driver to
expose a irq_set_irqchip_state(IRQCHIP_STATE_PENDING) callback,
or on the core code to be able to retrigger a (edge-only)
interrupt.

To use this feature:

echo -n trigger > /sys/kernel/debug/irq/irqs/IRQNUM

WARNING: This is DANGEROUS, and strictly a debug feature.
Do not use it on a production system. Your HW is likely to
catch fire, your data to be corrupted, and reporting this will
make you look an even bigger fool than the idiot who wrote
this patch.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170818081156.9264-1-marc.zyngier@arm.com
2017-08-18 10:36:24 +02:00
Thomas Gleixner
7cb2fad97e Merge branch 'irq/for-gpio' into irq/core
Merge the irq simulator which is in a separate branch so it can be consumed
by the gpio folks.
2017-08-16 16:41:28 +02:00
Bartosz Golaszewski
44e72c7ebf genirq/irq_sim: Add a devres variant of irq_sim_init()
Add a resource managed version of irq_sim_init(). This can be
conveniently used in device drivers.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Link: http://lkml.kernel.org/r/20170814145318.6495-3-brgl@bgdev.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-08-16 16:40:02 +02:00
Bartosz Golaszewski
b19af510e6 genirq/irq_sim: Add a simple interrupt simulator framework
Implement a simple, irq_work-based framework for simulating
interrupts. Currently the API exposes routines for initializing and
deinitializing the simulator object, enqueueing the interrupts and
retrieving the allocated interrupt numbers based on the offset of the
dummy interrupt in the simulator struct.

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Link: http://lkml.kernel.org/r/20170814145318.6495-2-brgl@bgdev.pl
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-08-16 16:40:02 +02:00
Masahiro Yamada
163616cf2f genirq: Fix for_each_action_of_desc() macro
struct irq_desc does not have a member named "act".  The correct
name is "action".

Currently, all users of this macro use an iterator named "action".
If a different name is used, it will cause a build error.

Fixes: f944b5a7af ("genirq: Use a common macro to go through the actions list")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1502260341-28184-1-git-send-email-yamada.masahiro@socionext.com
2017-08-14 12:10:37 +02:00
Thomas Gleixner
8397913303 genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"
That commit was part of the changes moving x86 to the generic CPU hotplug
interrupt migration code. The force flag was required on x86 before the
hierarchical irqdomain rework, but invoking set_affinity() with force=true
stayed and had no side effects.

At some point in the past, the force flag got repurposed to support the
exynos timer interrupt affinity setting to a not yet online CPU, so the
interrupt controller callback does not verify the supplied affinity mask
against cpu_online_mask.

Setting the flag in the CPU hotplug code causes the cpu online masking to
be blocked on these irq controllers and results in potentially affining an
interrupt to the CPU which is unplugged, i.e. instead of moving it away,
it's just reassigned to it.

As the force flags is not longer needed on x86, it's safe to revert that
patch so the ARM irqchips which use the force flag work again.

Add comments to that effect, so this won't happen again.

Note: The online mask handling should be done in the generic code and the
force flag and the masking in the irq chips removed all together, but
that's not a change possible for 4.13. 

Fixes: 77f85e66aa ("genirq/cpuhotplug: Set force affinity flag on hotplug migration")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707271217590.3109@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-07-27 15:40:02 +02:00
Arnd Bergmann
b6eb66fd34 device property: export irqchip_fwnode_ops
The newly added irqchip_fwnode_ops structure is not exported, which can
lead to link errors:

ERROR: "irqchip_fwnode_ops" [drivers/gpio/gpio-xgene-sb.ko] undefined!

I checked that all other such symbols that were introduced are
exported if they need to be, this is the only missing one.

Fixes: db3e50f323 (device property: Get rid of struct fwnode_handle type field)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-26 02:19:35 +02:00
Sakari Ailus
db3e50f323 device property: Get rid of struct fwnode_handle type field
Instead of relying on the struct fwnode_handle type field, define
fwnode_operations structs for all separate types of fwnodes. To find out
the type, compare to the ops field to relevant ops structs.

This change has two benefits:

1. it avoids adding the type field to each and every instance of struct
fwnode_handle, thus saving memory and

2. makes the ops field the single factor that defines both the types of
the fwnode as well as defines the implementation of its operations,
decreasing the possibility of bugs when developing code dealing with
fwnode internals.

Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-22 00:04:50 +02:00
Juergen Gross
a696712c3d genirq/PM: Properly pretend disabled state when force resuming interrupts
Interrupts with the IRQF_FORCE_RESUME flag set have also the
IRQF_NO_SUSPEND flag set. They are not disabled in the suspend path, but
must be forcefully resumed. That's used by XEN to keep IPIs enabled beyond
the suspension of device irqs. Force resume works by pretending that the
interrupt was disabled and then calling __irq_enable().

Incrementing the disabled depth counter was enough to do that, but with the
recent changes which use state flags to avoid unnecessary hardware access,
this is not longer sufficient. If the state flags are not set, then the
hardware callbacks are not invoked and the interrupt line stays disabled in
"hardware".

Set the disabled and masked state when pretending that an interrupt got
disabled by suspend.

Fixes: bf22ff45be ("genirq: Avoid unnecessary low level irq function calls")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/20170717174703.4603-2-jgross@suse.com
2017-07-17 22:32:20 +02:00
Linus Torvalds
935acd3f5e Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
 "Fix the fallout from reworking the locking and resource management in
  request/free_irq()"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Keep chip buslock across irq_request/release_resources()
2017-07-17 13:00:36 -07:00
Thomas Gleixner
19d39a3810 genirq: Keep chip buslock across irq_request/release_resources()
Moving the irq_request/release_resources() callbacks out of the spinlocked,
irq disabled and bus locked region, unearthed an interesting abuse of the
irq_bus_lock/irq_bus_sync_unlock() callbacks.

The OMAP GPIO driver does merily power management inside of them. The
irq_request_resources() callback of this GPIO irqchip calls a function
which reads a GPIO register. That read aborts now because the clock of the
GPIO block is not magically enabled via the irq_bus_lock() callback.

Move the callbacks under the bus lock again to prevent this. In the
free_irq() path this requires to drop the bus_lock before calling
synchronize_irq() and reaquiring it before calling the
irq_release_resources() callback.

The bus lock can't be held because:

   1) The data which has been changed between bus_lock/un_lock is cached in
      the irq chip driver private data and needs to go out to the irq chip
      via the slow bus (usually SPI or I2C) before calling
      synchronize_irq().

      That's the reason why this bus_lock/unlock magic exists in the first
      place, as you cannot do SPI/I2C transactions while holding desc->lock
      with interrupts disabled.

   2) synchronize_irq() will actually deadlock, if there is a handler on
      flight. These chips use threaded handlers for obvious reasons, as
      they allow to do SPI/I2C communication. When the threaded handler
      returns then bus_lock needs to be taken in irq_finalize_oneshot() as
      we need to talk to the actual irq chip once more. After that the
      threaded handler is marked done, which makes synchronize_irq() return.

      So if we hold bus_lock accross the synchronize_irq() call, the
      handler cannot mark itself done because it blocks on the bus
      lock. That in turn makes synchronize_irq() wait forever on the
      threaded handler to complete....

Add the missing unlock of desc->request_mutex in the error path of
__free_irq() and add a bunch of comments to explain the locking and
protection rules.

Fixes: 46e48e2573 ("genirq: Move irq resource handling out of spinlocked region")
Reported-and-tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not-longer-ranted-at-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
2017-07-12 10:14:42 +02:00
Linus Torvalds
7cb328c30a Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:

 - A few fixes mopping up the fallout of the big irq overhaul

 - Move the interrupt resource management logic out of the spin locked,
   irq disabled region to avoid unnecessary restrictions of the resource
   callbacks

 - Preparation for reworking the per cpu irq request function.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Allow ACPI device nodes to be used as irqdomain identifiers
  genirq/debugfs: Remove redundant NULL pointer check
  genirq: Allow to pass the IRQF_TIMER flag with percpu irq request
  genirq/timings: Move free timings out of spinlocked region
  genirq: Move irq resource handling out of spinlocked region
  genirq: Add mutex to irq desc to serialize request/free_irq()
  genirq: Move bus locking into __setup_irq()
  genirq: Force inlining of __irq_startup_managed to prevent build failure
  genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN
2017-07-09 10:24:46 -07:00
Linus Torvalds
f263fbb8d6 pci-v4.13-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZYAFUAAoJEFmIoMA60/r8cFQP/A4fpdjhd42WRNQXGTpZieop
 i40lBQtGdBn/UY97U6BoutcS1ygDi9OiSzg+IR6I90iMgidqyUHFhe4hGWgVHD2g
 Tg0KLzd+lKKfQ6Gqt1P6t4dLGLvyEj5NUbCeFE4XYODAUkkiBaOndax6DK1GvU54
 Vjuj63rHtMKFR/tG/4iFTigObqyI8QE6O9JVxwuvIyEX6RXKbJe+wkulv5taSnWt
 Ne94950i10MrELtNreVdi8UbCbXiqjg0r5sKI/WTJ7Bc7WsC7X5PhWlhcNrbHyBT
 Ivhoypkui3Ky8gvwWqL0KBG+cRp8prBXAdabrD9wRbz0TKnfGI6pQzseCGRnkE6T
 mhlSJpsSNIHaejoCjk93yPn5oRiTNtPMdVhMpEQL9V/crVRGRRmbd7v2TYvpMHVR
 JaPZ8bv+C2aBTY8uL3/v/rgrjsMKOYFeaxeNklpErxrknsbgb6BgubmeZXDvTBVv
 YUIbAkvveonUKisv+kbD8L7tp1+jdbRUT0AikS0NVgAJQhfArOmBcDpTL9YC51vE
 feFhkVx4A32vvOm7Zcg9A7IMXNjeSfccKGw3dJOAvzgDODuJiaCG6S0o7B5Yngze
 axMi87ixGT4QM98z/I4MC8E9rDrJdIitlpvb6ZBgiLzoO3kmvsIZZKt8UxWqf5r8
 w3U2HoyKH13Qbkn1xkum
 =mkyb
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

  - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee
    Khee)

  - make host bridge IRQ mapping much more generic (Matthew Minter,
    Lorenzo Pieralisi)

  - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo
    Pieralisi)

  - mutex sriov_configure() (Jakub Kicinski)

  - mutex pci_error_handlers callbacks (Christoph Hellwig)

  - split ->reset_notify() into ->reset_prepare()/reset_done()
    (Christoph Hellwig)

  - support multiple PCIe portdrv interrupts for MSI as well as MSI-X
    (Gabriele Paoloni)

  - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele
    Paoloni)

  - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez)

  - test INTx masking during enumeration, not at run-time (Piotr Gregor)

  - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki)

  - restore the status of PCI devices across hibernation (Chen Yu)

  - keep parent resources that start at 0x0 (Ard Biesheuvel)

  - enable ECRC only if device supports it (Bjorn Helgaas)

  - restore PRI and PASID state after Function-Level Reset (CQ Tang)

  - skip DPC event if device is not present (Keith Busch)

  - check domain when matching SMBIOS info (Sujith Pandel)

  - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson)

  - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng)

  - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas)

  - add Switchtec "running" status flag (Logan Gunthorpe)

  - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav)

  - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar
    Gogada)

  - move VMD SRCU cleanup after bus, child device removal (Jon Derrick)

  - add Faraday clock handling (Linus Walleij)

  - configure Rockchip MPS and reorganize (Shawn Lin)

  - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla)

  - support Tegra MSI 64-bit addressing (Thierry Reding)

  - use Rockchip normal (not privileged) register bank (Shawn Lin)

  - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song)

  - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc
    Gonzalez)

  - add MediaTek PCIe host controller support (Ryder Lee)

  - add Qualcomm IPQ4019 support (John Crispin)

  - add HyperV vPCI protocol v1.2 support (Jork Loeser)

  - add i.MX6 regulator support (Quentin Schulz)

* tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
  PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support
  PCI: Add DT binding for Sigma Designs Tango PCIe controller
  PCI: rockchip: Use normal register bank for config accessors
  dt-bindings: PCI: Add documentation for MediaTek PCIe
  PCI: Remove __pci_dev_reset() and pci_dev_reset()
  PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done()
  PCI: xilinx: Make of_device_ids const
  PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts
  PCI: vmd: Move SRCU cleanup after bus, child device removal
  PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000
  PCI: versatile: Add local struct device pointers
  PCI: tegra: Do not allocate MSI target memory
  PCI: tegra: Support MSI 64-bit addressing
  PCI: rockchip: Use local struct device pointer consistently
  PCI: rockchip: Check for clk_prepare_enable() errors during resume
  MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer
  PCI: rockchip: Configure RC's MPS setting
  PCI: rockchip: Reconfigure configuration space header type
  PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses()
  PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu()
  ...
2017-07-08 15:51:57 -07:00
Marc Zyngier
c5c601c429 irqdomain: Allow ACPI device nodes to be used as irqdomain identifiers
A number of irqchip implementations are (ab)using the irqdomain allocator
by passing a fwnode that is neither a FWNODE_OF or a FWNODE_IRQCHIP.

This is pretty bad, but it also feels pretty crap to force these drivers to
allocate their own irqchip_fwid when they already have a proper fwnode.

Instead, let's teach the irqdomain allocator about ACPI device nodes, and
add some lovely name generation code... Tested on an arm64 D05 system.

Reported-and-tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ma Jun <majun258@huawei.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Link: http://lkml.kernel.org/r/20170707083959.10349-1-marc.zyngier@arm.com
2017-07-07 12:13:29 +02:00
Thomas Gleixner
f610c9d68b genirq/debugfs: Remove redundant NULL pointer check
debugfs_remove() can be called with a NULL pointer.

Fixes: 087cdfb662 ("genirq/debugfs: Add proper debugfs interface")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-07-07 08:57:57 +02:00
Daniel Lezcano
c80081b920 genirq: Allow to pass the IRQF_TIMER flag with percpu irq request
The irq timings infrastructure tracks when interrupts occur in order to
statistically predict te next interrupt event.

There is no point to track timer interrupts and try to predict them because
the next expiration time is already known. This can be avoided via the
IRQF_TIMER flag which is passed by timer drivers in request_irq(). It marks
the interrupt as timer based which alloes to ignore these interrupts in the
timings code.

Per CPU interrupts which are requested via request_percpu_+irq() have no
flag argument, so marking per cpu timer interrupts is not possible and they
get tracked pointlessly.

Add __request_percpu_irq() as a variant of request_percpu_irq() with a
flags argument and make request_percpu_irq() an inline wrapper passing
flags = 0.

The flag parameter is restricted to IRQF_TIMER as all other IRQF_ flags
make no sense for per cpu interrupts.

The next step is to convert all existing users of request_percpu_irq() and
then remove the wrapper and the underscores.

[ tglx: Massaged changelog ]

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: nicolas.pitre@linaro.org
Cc: vincent.guittot@linaro.org
Cc: rafael@kernel.org
Link: http://lkml.kernel.org/r/1499344144-3964-1-git-send-email-daniel.lezcano@linaro.org
2017-07-06 23:16:22 +02:00
Thomas Gleixner
2343877fbd genirq/timings: Move free timings out of spinlocked region
No point to do memory management from a interrupt disabled spin locked
region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.196130646@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
46e48e2573 genirq: Move irq resource handling out of spinlocked region
Aside of being conceptually wrong, there is also an actual (hard to
trigger and mostly theoretical) problem.

CPU0				CPU1
free_irq(X)			interrupt X
				spin_lock(desc->lock)
				wake irq thread()
				spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()			
release_resources()		thread_handler()
spin_unlock(desc->lock)		  access released resources.

synchronize_irq()

Move the release resources invocation after synchronize_irq() so it's
guaranteed that the threaded handler has finished.

Move the resource request call out of the desc->lock held region as well,
so the invocation context is the same for both request and release.

This solves the problems with those functions on RT as well.
 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.117028181@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
9114014cf4 genirq: Add mutex to irq desc to serialize request/free_irq()
The irq_request/release_resources() callbacks ar currently invoked under
desc->lock with interrupts disabled. This is a source of problems on RT and
conceptually not required.

Add a seperate mutex to struct irq_desc which allows to serialize
request/free_irq(), which can be used to move the resource functions out of
the desc->lock held region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.039220922@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
3a90795e1e genirq: Move bus locking into __setup_irq()
There is no point in having the irq_bus_lock() protection around all
callers to __setup_irq().

Move it into __setup_irq(). This is also a preparatory patch for addressing
the issues with the irq resource callbacks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214343.960949031@linutronix.de
2017-07-04 12:46:15 +02:00
Geert Uytterhoeven
2372a519f6 genirq: Force inlining of __irq_startup_managed to prevent build failure
If CONFIG_SMP=n, and gcc (e.g. 4.1.2) decides not to inline
__irq_startup_managed(), the build fails with:

    kernel/built-in.o: In function `irq_startup':
    (.text+0x38ed8): undefined reference to `irq_set_affinity_locked'

Fix this by forcing inlining of __irq_startup_managed().

Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1499162761-12398-1-git-send-email-geert@linux-m68k.org
2017-07-04 12:36:44 +02:00
Sebastian Ott
e5682b4eec genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN
Fix this build error:

kernel/irq/internals.h:440:20: error: inlining failed in call to always_inline
  'irq_domain_debugfs_init': function body not available
kernel/irq/debugfs.c:202:2: note: called from here
  irq_domain_debugfs_init(root_dir);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1707041124000.1712@schleppi
2017-07-04 12:36:43 +02:00
Linus Torvalds
650fc870a2 There has been a fair amount of activity in the docs tree this time
around.  Highlights include:
 
  - Conversion of a bunch of security documentation into RST
 
  - The conversion of the remaining DocBook templates by The Amazing
    Mauro Machine.  We can now drop the entire DocBook build chain.
 
  - The usual collection of fixes and minor updates.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZWkGAAAoJEI3ONVYwIuV6rf0P/0B3JTiVPKS/WUx53+jzbAi4
 1BN7dmmuMxE1bWpgdEq+ac4aKxm07iAojuntuMj0qz/ZB1WARcmvEqqzI5i4wfq9
 5MrLduLkyuWfr4MOPseKJ2VK83p8nkMOiO7jmnBsilu7fE4nF+5YY9j4cVaArfMy
 cCQvAGjQzvej2eiWMGUSLHn4QFKh00aD7cwKyBVsJ08b27C9xL0J2LQyCDZ4yDgf
 37/MH3puEd3HX/4qAwLonIxT3xrIrrbDturqLU7OSKcWTtGZNrYyTFbwR3RQtqWd
 H8YZVg2Uyhzg9MYhkbQ2E5dEjUP4mkegcp6/JTINH++OOPpTbdTJgirTx7VTkSf1
 +kL8t7+Ayxd0FH3+77GJ5RMj8LUK6rj5cZfU5nClFQKWXP9UL3IelQ3Nl+SpdM8v
 ZAbR2KjKgH9KS6+cbIhgFYlvY+JgPkOVruwbIAc7wXVM3ibk1sWoBOFEujcbueWh
 yDpQv3l1UX0CKr3jnevJoW26LtEbGFtC7gSKZ+3btyeSBpWFGlii42KNycEGwUW0
 ezlwryDVHzyTUiKllNmkdK4v73mvPsZHEjgmme4afKAIiUilmcUF4XcqD86hISFT
 t+UJLA/zEU+0sJe26o2nK6GNJzmo4oCtVyxfhRe26Ojs1n80xlYgnZRfuIYdd31Z
 nwLBnwDCHAOyX91WXp9G
 =cVjZ
 -----END PGP SIGNATURE-----

Merge tag 'docs-4.13' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "There has been a fair amount of activity in the docs tree this time
  around. Highlights include:

   - Conversion of a bunch of security documentation into RST

   - The conversion of the remaining DocBook templates by The Amazing
     Mauro Machine. We can now drop the entire DocBook build chain.

   - The usual collection of fixes and minor updates"

* tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
  scripts/kernel-doc: handle DECLARE_HASHTABLE
  Documentation: atomic_ops.txt is core-api/atomic_ops.rst
  Docs: clean up some DocBook loose ends
  Make the main documentation title less Geocities
  Docs: Use kernel-figure in vidioc-g-selection.rst
  Docs: fix table problems in ras.rst
  Docs: Fix breakage with Sphinx 1.5 and upper
  Docs: Include the Latex "ifthen" package
  doc/kokr/howto: Only send regression fixes after -rc1
  docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
  doc: Document suitability of IBM Verse for kernel development
  Doc: fix a markup error in coding-style.rst
  docs: driver-api: i2c: remove some outdated information
  Documentation: DMA API: fix a typo in a function name
  Docs: Insert missing space to separate link from text
  doc/ko_KR/memory-barriers: Update control-dependencies example
  Documentation, kbuild: fix typo "minimun" -> "minimum"
  docs: Fix some formatting issues in request-key.rst
  doc: ReSTify keys-trusted-encrypted.txt
  doc: ReSTify keys-request-key.txt
  ...
2017-07-03 21:13:25 -07:00
Jeffy Chen
bf22ff45be genirq: Avoid unnecessary low level irq function calls
Check irq state in enable/disable/unmask/mask_irq to avoid unnecessary
low level irq function calls.

This has two advantages:
    - Conditionals are faster than hardware access

    - Solves issues with the underlying refcounting of the pinctrl
      infrastructure

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tfiga@chromium.org
Cc: briannorris@chromium.org
Cc: dianders@chromium.org
Link: http://lkml.kernel.org/r/1498476814-12563-2-git-send-email-jeffy.chen@rock-chips.com
2017-06-26 15:47:00 +02:00
Jeffy Chen
d829b8fb24 genirq: Set irq masked state when initializing irq_desc
The irq default state is set to disabled when allocating irq desc, but the
masked state flag is not set. This is inconsistent vs. the state tracking
logic which is used to prevent unnecessary calls to hardware level irq chip
functions.

Set the masked state flag as well.

Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tfiga@chromium.org
Cc: briannorris@chromium.org
Cc: dianders@chromium.org
Link: http://lkml.kernel.org/r/1498476814-12563-1-git-send-email-jeffy.chen@rock-chips.com
2017-06-26 14:05:41 +02:00
Daniel Lezcano
e1c9214955 genirq/timings: Add infrastructure for estimating the next interrupt arrival time
An interrupt behaves with a burst of activity with periodic interval of time
followed by one or two peaks of longer interval.

As the time intervals are periodic, statistically speaking they follow a normal
distribution and each interrupts can be tracked individually.

Add a mechanism to compute the statistics on all interrupts, except the
timers which are deterministic from a prediction point of view, as their
expiry time is known.

The goal is to extract the periodicity for each interrupt, with the last
timestamp and sum them, so the next event can be predicted to a certain
extent.

Taking the earliest prediction gives the expected wakeup on the system
(assuming a timer won't expire before).

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-2-git-send-email-daniel.lezcano@linaro.org
2017-06-24 11:44:39 +02:00
Daniel Lezcano
b2d3d61adb genirq/timings: Add infrastructure to track the interrupt timings
The interrupt framework gives a lot of information about each interrupt. It
does not keep track of when those interrupts occur though, which is a
prerequisite for estimating the next interrupt arrival for power management
purposes.

Add a mechanism to record the timestamp for each interrupt occurrences in a
per-CPU circular buffer to help with the prediction of the next occurrence
using a statistical model.

Each CPU can store up to IRQ_TIMINGS_SIZE events <irq, timestamp>, the
current value of IRQ_TIMINGS_SIZE is 32.

Each event is encoded into a single u64, where the high 48 bits are used
for the timestamp and the low 16 bits are for the irq number.

A static key is introduced so when the irq prediction is switched off at
runtime, the overhead is near to zero.

It results in most of the code in internals.h for inline reasons and a very
few in the new file timings.c. The latter will contain more in the next patch
which will provide the statistical model for the next event prediction.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-1-git-send-email-daniel.lezcano@linaro.org
2017-06-24 11:44:11 +02:00
Thomas Gleixner
c2ce34c0a0 genirq/debugfs: Remove pointless NULL pointer check
debugfs_remove() has it's own NULL pointer check. Remove the conditional
and make irq_remove_debugfs_entry() an inline helper

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-24 11:43:53 +02:00
Marc Zyngier
6a6544e520 genirq/irqdomain: Remove auto-recursive hierarchy support
It did seem like a good idea at the time, but it never really
caught on, and auto-recursive domains remain unused 3 years after
having been introduced.

Oh well, time for a late spring cleanup.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-22 18:29:34 +02:00
Marc Zyngier
61d0a000b7 genirq/irqdomain: Add irq_domain_update_bus_token helper
We can have irq domains that are identified by the same fwnode
(because they are serviced by the same HW), and yet have different
functionnality (because they serve different busses, for example).
This is what we use the bus_token field.

Since we don't use this field when generating the domain name,
all the aliasing domains will get the same name, and the debugfs
file creation fails. Also, bus_token is updated by individual drivers,
and the core code is unaware of that update.

In order to sort this mess, let's introduce a helper that takes care
of updating bus_token, and regenerate the debugfs file.

A separate patch will update all the individual users.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-22 18:28:45 +02:00
Christoph Hellwig
9a0ef98e18 genirq/affinity: Assign vectors to all present CPUs
Currently the irq vector spread algorithm is restricted to online CPUs,
which ties the IRQ mapping to the currently online devices and doesn't deal
nicely with the fact that CPUs could come and go rapidly due to e.g. power
management.

Instead assign vectors to all present CPUs to avoid this churn.

Build a map of all possible CPUs for a given node, as the architectures
only provide a map of all onlines CPUs. Do this dynamically on each call
for the vector assingments, which is a bit suboptimal and could be
optimized in the future by provinding a mapping from the arch code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-nvme@lists.infradead.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170603140403.27379-5-hch@lst.de
2017-06-22 18:21:26 +02:00
Thomas Gleixner
8f31a9845d genirq/cpuhotplug: Avoid irq affinity setting for single targets
Avoid trying to add a newly online CPU to the effective affinity mask of an
started up interrupt. That interrupt will either stay on the already online
CPU or move around for no value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.431321047@linutronix.de
2017-06-22 18:21:25 +02:00
Thomas Gleixner
d52dd44175 genirq: Introduce IRQD_SINGLE_TARGET flag
Many interrupt chips allow only a single CPU as interrupt target. The core
code has no knowledge about that. That's unfortunate as it could avoid
trying to readd a newly online CPU to the effective affinity mask.

Add the status flag and the necessary accessors.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.352343969@linutronix.de
2017-06-22 18:21:25 +02:00
Thomas Gleixner
c5cb83bb33 genirq/cpuhotplug: Handle managed IRQs on CPU hotplug
If a CPU goes offline, interrupts affine to the CPU are moved away. If the
outgoing CPU is the last CPU in the affinity mask the migration code breaks
the affinity and sets it it all online cpus.

This is a problem for affinity managed interrupts as CPU hotplug is often
used for power management purposes. If the affinity is broken, the
interrupt is not longer affine to the CPUs to which it was allocated.

The affinity spreading allows to lay out multi queue devices in a way that
they are assigned to a single CPU or a group of CPUs. If the last CPU goes
offline, then the queue is not longer used, so the interrupt can be
shutdown gracefully and parked until one of the assigned CPUs comes online
again.

Add a graceful shutdown mechanism into the irq affinity breaking code path,
mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.

In the online path, scan the active interrupts for managed interrupts and
if the interrupt is functional and the newly online CPU is part of the
affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN or if
the interrupts is started up, try to add the CPU back to the effective
affinity mask.

Originally-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de
2017-06-22 18:21:25 +02:00
Thomas Gleixner
761ea388e8 genirq: Handle managed irqs gracefully in irq_startup()
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently and set these interrupts into a managed shutdown
state when the last CPU of the assigned affinity mask goes offline. The
interrupt will be restarted when one of the CPUs in the assigned affinity
mask comes back online.

Add the necessary logic to irq_startup(). If an interrupt is requested and
started up, the code checks whether it is affinity managed and if so, it
checks whether a CPU in the interrupts affinity mask is online. If not, it
puts the interrupt into managed shutdown state. 

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.189851170@linutronix.de
2017-06-22 18:21:24 +02:00
Thomas Gleixner
4cde9c6b82 genirq: Add force argument to irq_startup()
In order to handle managed interrupts gracefully on irq_startup() so they
won't lose their assigned affinity, it's necessary to allow startups which
keep the interrupts in managed shutdown state, if none of the assigend CPUs
is online. This allows drivers to request interrupts w/o the CPUs being
online, which avoid online/offline churn in drivers.

Add a force argument which can override that decision and let only
request_irq() and enable_irq() allow the managed shutdown
handling. enable_irq() is required, because the interrupt might be
requested with IRQF_NOAUTOEN and enable_irq() invokes irq_startup() which
would then wreckage the assignment again. All other callers force startup
and potentially break the assigned affinity.

No functional change as this only adds the function argument.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.112094565@linutronix.de
2017-06-22 18:21:24 +02:00
Thomas Gleixner
708d174b6c genirq: Split out irq_startup() code
Split out the inner workings of irq_startup() so it can be reused to handle
managed interrupts gracefully.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.033235144@linutronix.de
2017-06-22 18:21:24 +02:00
Thomas Gleixner
54fdf6a087 genirq: Introduce IRQD_MANAGED_SHUTDOWN
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently. This will set these interrupts into a managed
shutdown state when the last CPU of the assigned affinity mask goes
offline. The interrupt will be restarted when one of the CPUs in the
assigned affinity mask comes back online.

Introduce the necessary state flag and the accessor functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.954523476@linutronix.de
2017-06-22 18:21:23 +02:00
Thomas Gleixner
415fcf1a22 genirq/cpuhotplug: Use effective affinity mask
If the architecture supports the effective affinity mask, migrating
interrupts away which are not targeted by the effective mask is
pointless.

They can stay in the user or system supplied affinity mask, but won't be
targetted at any given point as the affinity setter functions need to
validate against the online cpu mask anyway.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.328488490@linutronix.de
2017-06-22 18:21:21 +02:00
Thomas Gleixner
0d3f54257d genirq: Introduce effective affinity mask
There is currently no way to evaluate the effective affinity mask of a
given interrupt. Many irq chips allow only a single target CPU or a subset
of CPUs in the affinity mask.

Updating the mask at the time of setting the affinity to the subset would
be counterproductive because information for cpu hotplug about assigned
interrupt affinities gets lost. On CPU hotplug it's also pointless to force
migrate an interrupt, which is not targeted at the CPU effectively. But
currently the information is not available.

Provide a seperate mask to be updated by the irq_chip->irq_set_affinity()
implementations. Implement the read only proc files so the user can see the
effective mask as well w/o trying to deduce it from /proc/interrupts.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de
2017-06-22 18:21:20 +02:00
Thomas Gleixner
c1a8038696 genirq/proc: Replace ever repeating type cast
The proc file setup repeats the same ugly type cast for the irq number over
and over. Do it once and hand in the local void pointer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.160866358@linutronix.de
2017-06-22 18:21:20 +02:00
Thomas Gleixner
4ab764c336 genirq: Remove pointless gfp argument
All callers hand in GPF_KERNEL. No point to have an extra argument for
that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.082544752@linutronix.de
2017-06-22 18:21:19 +02:00
Thomas Gleixner
047dc6331d genirq: Remove pointless arg from show_irq_affinity
The third argument of the internal helper function is unused. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.004958600@linutronix.de
2017-06-22 18:21:19 +02:00
Thomas Gleixner
36d84fb451 genirq: Move irq_fixup_move_pending() to core
Now that x86 uses the generic code, the function declaration and inline
stub can move to the core internal header.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.928156166@linutronix.de
2017-06-22 18:21:19 +02:00
Thomas Gleixner
77f85e66aa genirq/cpuhotplug: Set force affinity flag on hotplug migration
Set the force migration flag when migrating interrupts away from an
outgoing CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.681874648@linutronix.de
2017-06-22 18:21:18 +02:00
Thomas Gleixner
47a06d3a78 genirq/cpuhotplug: Add support for conditional masking
Interrupts which cannot be migrated in process context, need to be masked
before the affinity is changed forcefully.

Add support for that. Will be compiled out for architectures which do not
have this x86 specific issue.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.604565591@linutronix.de
2017-06-22 18:21:17 +02:00
Thomas Gleixner
f0383c24b4 genirq/cpuhotplug: Add support for cleaning up move in progress
In order to move x86 to the generic hotplug migration code, add support for
cleaning up move in progress bits.

On architectures which have this x86 specific (mis)feature not enabled,
this is optimized out by the compiler.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.525817311@linutronix.de
2017-06-22 18:21:17 +02:00
Thomas Gleixner
91f26cb4cd genirq/cpuhotplug: Do not migrated shutdown irqs
Interrupts, which are shut down are tried to be migrated as well. That's
pointless because the interrupt cannot fire and the next startup will move
it to the proper place anyway.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.447550992@linutronix.de
2017-06-22 18:21:17 +02:00
Thomas Gleixner
e8a7035039 genirq/cpuhotplug: Reorder check logic
Move the checks for a valid irq chip and the irq_set_affinity() callback
right in front of the whole migration logic. No point in doing a gazillion
of other things when the interrupt cannot be migrated at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.354181630@linutronix.de
2017-06-22 18:21:16 +02:00
Thomas Gleixner
735c09524d genirq/cpuhotplug: Dont claim success on error
In case the affinity of an interrupt was broken, a printk is emitted.

But if the affinity cannot be set at all due to a missing
irq_set_affinity() callback or due to a failing callback, the message is
still printed preceeded by a warning/error.

That makes no sense whatsoever.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.274852976@linutronix.de
2017-06-22 18:21:16 +02:00
Thomas Gleixner
0dd945ff46 genirq/cpuhotplug: Remove irq disabling logic
This is called from stop_machine() with interrupts disabled. No point in
disabling them some more.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.198042748@linutronix.de
2017-06-22 18:21:16 +02:00
Christoph Hellwig
137221df69 genirq: Move pending helpers to internal.h
So that the affinity code can reuse them.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619235445.109426284@linutronix.de
2017-06-22 18:21:15 +02:00
Thomas Gleixner
2e051552df genirq: Move initial affinity setup to irq_startup()
The startup vs. setaffinity ordering of interrupts depends on the
IRQF_NOAUTOEN flag. Chained interrupts are not getting any affinity
assignment at all.

A regular interrupt is started up and then the affinity is set. A
IRQF_NOAUTOEN marked interrupt is not started up, but the affinity is set
nevertheless.

Move the affinity setup to startup_irq() so the ordering is always the same
and chained interrupts get the proper default affinity assigned as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.020534783@linutronix.de
2017-06-22 18:21:15 +02:00
Thomas Gleixner
43564bd97d genirq: Rename setup_affinity() to irq_setup_affinity()
Rename it with a proper irq_ prefix and make it available for other files
in the core code. Preparatory patch for moving the irq affinity setup
around.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.928501004@linutronix.de
2017-06-22 18:21:14 +02:00
Thomas Gleixner
cba4235e60 genirq: Remove mask argument from setup_affinity()
No point to have this alloc/free dance of cpumasks. Provide a static mask
for setup_affinity() and protect it proper.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.851571573@linutronix.de
2017-06-22 18:21:14 +02:00
Thomas Gleixner
cdd16365b0 genirq: Provide irq_fixup_move_pending()
If an CPU goes offline, the interrupts are migrated away, but a eventually
pending interrupt move, which has not yet been made effective is kept
pending even if the outgoing CPU is the sole target of the pending affinity
mask. What's worse is, that the pending affinity mask is discarded even if
it would contain a valid subset of the online CPUs.

Implement a helper function which allows to avoid these issues.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.691345468@linutronix.de
2017-06-22 18:21:13 +02:00
Thomas Gleixner
087cdfb662 genirq/debugfs: Add proper debugfs interface
Debugging (hierarchical) interupt domains is tedious as there is no
information about the hierarchy and no information about states of
interrupts in the various domain levels.

Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'.

The domains directory contains the domain files. The content is information
about the domain. If the domain is part of a hierarchy then the parent
domains are printed as well.

# ls /sys/kernel/debug/irq/domains/
default     INTEL-IR-2	    INTEL-IR-MSI-2  IO-APIC-IR-2  PCI-MSI
DMAR-MSI    INTEL-IR-3	    INTEL-IR-MSI-3  IO-APIC-IR-3  unknown-1
INTEL-IR-0  INTEL-IR-MSI-0  IO-APIC-IR-0    IO-APIC-IR-4  VECTOR
INTEL-IR-1  INTEL-IR-MSI-1  IO-APIC-IR-1    PCI-HT

# cat /sys/kernel/debug/irq/domains/VECTOR 
name:   VECTOR
 size:   0
 mapped: 216
 flags:  0x00000041

# cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0 
name:   IO-APIC-IR-0
 size:   24
 mapped: 19
 flags:  0x00000041
 parent: INTEL-IR-3
    name:   INTEL-IR-3
     size:   65536
     mapped: 167
     flags:  0x00000041
     parent: VECTOR
        name:   VECTOR
         size:   0
         mapped: 216
         flags:  0x00000041

Unfortunately there is no per cpu information about the VECTOR domain (yet).

The irqs directory contains detailed information about mapped interrupts.

# cat /sys/kernel/debug/irq/irqs/3
handler:  handle_edge_irq
status:   0x00004000
istate:   0x00000000
ddepth:   1
wdepth:   0
dstate:   0x01018000
            IRQD_IRQ_DISABLED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
node:     0
affinity: 0-143
effectiv: 0
pending:  
domain:  IO-APIC-IR-0
 hwirq:   0x3
 chip:    IR-IO-APIC
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  INTEL-IR-3
     hwirq:   0x20000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0x3
         chip:    APIC
          flags:   0x0

This was developed to simplify the debugging of the managed affinity
changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.537566163@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-22 18:21:13 +02:00
Thomas Gleixner
9dc6be3d41 genirq/irqdomain: Add map counter
Add a map counter instead of counting radix tree entries for
diagnosis. That also gives correct information for linear domains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.459397746@linutronix.de
2017-06-22 18:21:12 +02:00
Thomas Gleixner
d59f6617ee genirq: Allow fwnode to carry name information only
In order to provide proper debug interface it's required to have domain
names available when the domain is added. Non fwnode based architectures
like x86 have no way to do so.

It's not possible to use domain ops or host data for this as domain ops
might be the same for several instances, but the names have to be unique.

Extend the irqchip fwnode to allow transporting the domain name. If no node
is supplied, create a 'unknown-N' placeholder.

Warn if an invalid node is supplied and treat it like no node. This happens
e.g. with i2 devices on x86 which hand in an ACPI type node which has no
interface for retrieving the name.

[ Folded a fix from Marc to make DT name parsing work ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.588784933@linutronix.de
2017-06-22 18:21:08 +02:00
Thomas Gleixner
0165308a2f genirq/msi: Prevent overwriting domain name
Prevent overwriting an already assigned domain name. Remove the extra check
for chip->name, because if domain->name is NULL overwriting it with NULL is
not a problem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.510684976@linutronix.de
2017-06-22 18:21:08 +02:00
Bartosz Golaszewski
30fd8fc5c9 irq/generic-chip: Provide devm_irq_setup_generic_chip()
Provide a resource managed variant of irq_setup_generic_chip().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-6-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:11 +02:00
Bartosz Golaszewski
1c3e36309f irq/generic-chip: Provide devm_irq_alloc_generic_chip()
Provide a resource managed variant of irq_alloc_generic_chip().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-5-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:11 +02:00
Bartosz Golaszewski
f160203986 irq/generic-chip: Export irq_init_generic_chip() locally
This function will be used in the devres variant of
irq_alloc_generic_chip().

Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-4-git-send-email-brgl@bgdev.pl
2017-06-21 15:53:11 +02:00
Thomas Gleixner
b50fb7c992 Merge branch 'linus' into irq/core
Get upstream changes so pending patches won't conflict.
2017-06-20 22:08:32 +02:00
Heiner Kallweit
fa07ab72cb genirq: Release resources in __setup_irq() error path
In case __irq_set_trigger() fails the resources requested via
irq_request_resources() are not released.

Add the missing release call into the error handling path.

Fixes: c1bacbae81 ("genirq: Provide irq_request/release_resources chip callbacks")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/655538f5-cb20-a892-ff15-fbd2dd1fa4ec@gmail.com
2017-06-13 00:40:39 +02:00
Thomas Gleixner
04c848d398 genirq: Warn when IRQ_NOAUTOEN is used with shared interrupts
Shared interrupts do not go well with disabling auto enable:

1) The sharing interrupt might request it while it's still disabled and
   then wait for interrupts forever.

2) The interrupt might have been requested by the driver sharing the line
   before IRQ_NOAUTOEN has been set. So the driver which expects that
   disabled state after calling request_irq() will not get what it wants.
   Even worse, when it calls enable_irq() later, it will trigger the
   unbalanced enable_irq() warning.

Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: dianders@chromium.org
Cc: jeffy <jeffy.chen@rock-chips.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: tfiga@chromium.org
Link: http://lkml.kernel.org/r/20170531100212.210682135@linutronix.de
2017-06-04 14:38:41 +02:00
Thomas Gleixner
201d7f47f3 genirq: Handle NOAUTOEN interrupt setup proper
If an interrupt is marked NOAUTOEN then request_irq() installs the action,
but does not enable the interrupt via startup_irq().  The interrupt is
enabled via enable_irq() later from the driver. enable_irq() calls
irq_enable().

That means that for interrupts which have a irq_startup() callback this
callback is never invoked. Neither is irq_domain_activate_irq() invoked for
such interrupts.

If an interrupt depends on irq_startup() or irq_domain_activate_irq() then
the enable via irq_enable() is not enough.

Add a status flag IRQD_IRQ_STARTED_UP and use this to select the proper
mechanism in enable_irq(). Use the flag also to avoid pointless calls into
the low level functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: dianders@chromium.org
Cc: jeffy <jeffy.chen@rock-chips.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: tfiga@chromium.org
Link: http://lkml.kernel.org/r/20170531100212.130986205@linutronix.de
2017-06-04 14:35:13 +02:00
Vincent Legoll
5a29ef2209 genirq: Make early_irq_init() print out more informative
The printk in early_irq_init() is cryptic and badly formatted:

  NR_IRQS:33024 nr_irqs:968 16

The last number is the number of preallocated interrupts, so add a prefix
to it:

  NR_IRQS: 33024, nr_irqs: 968, preallocated irqs: 16

Cleanup the formatting for better readability as well.

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1494318849-6733-1-git-send-email-vincent.legoll@gmail.com
2017-05-26 14:54:05 +02:00
Marc Zyngier
a97b852b4d genirq/msi: Populate the domain name if provided by the irqchip
In order to ease debug, let's populate the domain name upfront, before any
MSI gets requested. This allows the domain to appear in the
irq_domain_mapping, and the user to easily find the expected data.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-4-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-05-22 22:29:45 +02:00
Marc Zyngier
2370c00dc7 irqdomain: Let irq_domain_mapping display ACPI fwnode attributes
If the system is using ACPI, there is no of_node to display. But ACPI can
use a struct irqchip_fwid as a domain identifier, and it can be used to
display the name contained in that structure.

The output on such a system will look like this:

 pMSI      0           0           0  irqchip@00000000e1180000
 MSI      37           0           0  irqchip@00000000e1180000
 GICv2m   37           0           0  irqchip@00000000e1180000
 GICv2   448         448           0  irqchip@ffff000008003000

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-3-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-05-22 22:29:44 +02:00
Marc Zyngier
fe17a42e70 irqdomain: Let irq_domain_mapping display hierarchical domains
Hierarchical domains seem to be hard to grasp, and a number of
aspiring kernel hackers find them utterly discombobulating.

In order to ease their pain, let's make them appear in
/sys/kernel/debug/irq_domain_mapping, such as the following:

   96  0x81808  MSI    0x          (null) RADIX   MSI
   96+ 0x00063  GICv2m 0xffff8003ee116980 RADIX   GICv2m
   96+ 0x00063  GICv2  0xffff00000916bfd8 LINEAR  GICv2

[output compressed to fit in a commit log]

This shows that IRQ96 is implemented by a stack of three domains,
the + sign indicating the stacking.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170512115538.10767-2-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-05-22 22:29:44 +02:00
Michael Hernandez
6f9a22bc57 PCI/MSI: Ignore affinity if pre/post vector count is more than min_vecs
min_vecs is the minimum amount of vectors needed to operate in MSI-X mode
which may just include the vectors that don't need affinity.

Disabling affinity settings causes the qla2xxx driver scsi_add_host() to fail
when blk_mq is enabled as the blk_mq_pci_map_queues() expects affinity masks
on each vector.

Fixes: dfef358bd1 ("PCI/MSI: Don't apply affinity if there aren't enough vectors left")
Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org	# v4.10+
2017-05-22 15:06:05 -05:00
Linus Torvalds
e7a3d62749 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A set of small fixes for the irq subsystem:

   - Cure a data ordering problem with chained interrupts

   - Three small fixlets for the mbigen irq chip"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Fix chained interrupt data ordering
  irqchip/mbigen: Fix the clear register offset calculation
  irqchip/mbigen: Fix potential NULL dereferencing
  irqchip/mbigen: Fix memory mapping code
2017-05-21 11:45:26 -07:00
Thomas Gleixner
2c4569ca26 genirq: Fix chained interrupt data ordering
irq_set_chained_handler_and_data() sets up the chained interrupt and then
stores the handler data.

That's racy against an immediate interrupt which gets handled before the
store of the handler data happened. The handler will dereference a NULL
pointer and crash.

Cure it by storing handler data before installing the chained handler.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
2017-05-16 15:03:26 +02:00
Mauro Carvalho Chehab
c0c6e08505 irq: update genericirq book location
This book got converted from DocBook. Update its references to
point to the current location.

Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-05-16 08:44:21 -03:00
Linus Torvalds
857f864014 pci-v4.12-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZEHmsAAoJEFmIoMA60/r88SgQAJbFddueb0+DfJ+USDud4b/Z
 akfS+G1UAm+TgtMyh1wM49dHzFssp36uWJxtWI+bPqBzuy94PMCbz7JVUV28gX9G
 tFhFuc5YH94I/3y85rbZnolb6uZN9MhLjzTFqDC9ilW6HFqmwK4t4wlHSCjQN1St
 svLYvs2G6n6/VK3Fre7/wOvdZ1erG4Qod+kn5Tx3K5TQydmRlaSBfK+DRANuDBkM
 KzGO7Bkc/Cx8hb9pHmaey/wxmNrrgmVjTtWrEnb2tEq833zP4h6GhUIJEKodMSi5
 gXPNZgKlu3n5L592M0UCh4EoHejzkv9wrcsoDm+djmsc5Zg2Howq4kAdHP8k4hUG
 0gt8n0ni9vhJN56jikrGi7cAdHCKSNnx2Ue/qTCbX0ncB3XUMuJxJwCsgW/6wa9f
 oU7tRtTS03UltnKoFAcyYclS4TaSY4SA4ySaK6Hi+cRkdVFDdyHQYbHHNSU7MsA+
 IS2tXvGoIdSYyrZMHSRcl2rRTfYQUkmPEvBF3LvqZr32M4mJMmUNAPLZaly373ZE
 iwq0ZJlrLeM0cqdFIG3S60RtJyQk/HBN1NMqrYHArWOxvWIgNd5F8NCsTTxY3wU3
 IxgBIuUFcbVwVkqEHGs8K5AvB3oghqdnA3eGOV79799eMtLn3LOvyIlpHMSw9WUq
 ags00JtMLitfNPBH3eSl
 =eE4D
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - add framework for supporting PCIe devices in Endpoint mode (Kishon
   Vijay Abraham I)

 - use non-postable PCI config space mappings when possible (Lorenzo
   Pieralisi)

 - clean up and unify mmap of PCI BARs (David Woodhouse)

 - export and unify Function Level Reset support (Christoph Hellwig)

 - avoid FLR for Intel 82579 NICs (Sasha Neftin)

 - add pci_request_irq() and pci_free_irq() helpers (Christoph Hellwig)

 - short-circuit config access failures for disconnected devices (Keith
   Busch)

 - remove D3 sleep delay when possible (Adrian Hunter)

 - freeze PME scan before suspending devices (Lukas Wunner)

 - stop disabling MSI/MSI-X in pci_device_shutdown() (Prarit Bhargava)

 - disable boot interrupt quirk for ASUS M2N-LR (Stefan Assmann)

 - add arch-specific alignment control to improve device passthrough by
   avoiding multiple BARs in a page (Yongji Xie)

 - add sysfs sriov_drivers_autoprobe to control VF driver binding
   (Bodong Wang)

 - allow slots below PCI-to-PCIe "reverse bridges" (Bjorn Helgaas)

 - fix crashes when unbinding host controllers that don't support
   removal (Brian Norris)

 - add driver for MicroSemi Switchtec management interface (Logan
   Gunthorpe)

 - add driver for Faraday Technology FTPCI100 host bridge (Linus
   Walleij)

 - add i.MX7D support (Andrey Smirnov)

 - use generic MSI support for Aardvark (Thomas Petazzoni)

 - make Rockchip driver modular (Brian Norris)

 - advertise 128-byte Read Completion Boundary support for Rockchip
   (Shawn Lin)

 - advertise PCI_EXP_LNKSTA_SLC for Rockchip root port (Shawn Lin)

 - convert atomic_t to refcount_t in HV driver (Elena Reshetova)

 - add CPU IRQ affinity in HV driver (K. Y. Srinivasan)

 - fix PCI bus removal in HV driver (Long Li)

 - add support for ThunderX2 DMA alias topology (Jayachandran C)

 - add ThunderX pass2.x 2nd node MCFG quirk (Tomasz Nowicki)

 - add ITE 8893 bridge DMA alias quirk (Jarod Wilson)

 - restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
   (Manish Jaggi)

* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
  PCI: Don't allow unbinding host controllers that aren't prepared
  ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
  MAINTAINERS: Add PCI Endpoint maintainer
  Documentation: PCI: Add userguide for PCI endpoint test function
  tools: PCI: Add sample test script to invoke pcitest
  tools: PCI: Add a userspace tool to test PCI endpoint
  Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
  misc: Add host side PCI driver for PCI test function device
  PCI: Add device IDs for DRA74x and DRA72x
  dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
  PCI: dwc: dra7xx: Workaround for errata id i870
  dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
  PCI: dwc: dra7xx: Add EP mode support
  PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
  dt-bindings: PCI: Add DT bindings for PCI designware EP mode
  PCI: dwc: designware: Add EP mode support
  Documentation: PCI: Add binding documentation for pci-test endpoint function
  ixgbe: Use pcie_flr() instead of duplicating it
  IB/hfi1: Use pcie_flr() instead of duplicating it
  PCI: imx6: Fix spelling mistake: "contol" -> "control"
  ...
2017-05-08 19:03:25 -07:00
Linus Torvalds
3cb6653552 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "Nothing exciting from the irq side for this merge window:

   - a new driver for a Mediatek SoC

   - ACPI support for ARM GICV3

   - support for shared nested interrupts

   - the usual pile of fixes and updates all over te place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
  irqchip/mbigen: Fix return value check in mbigen_device_probe()
  irqchip/mips-gic: Replace static map with dynamic
  irqchip/mips-gic: Remove device IRQ domain
  irqchip/mips-gic: Separate IPI reservation & usage tracking
  genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
  genirq: Use cpumask_available() for check of cpumask variable
  cpumask: Add helper cpumask_available()
  irqchip/irq-imx-gpcv2: Clear OF_POPULATED flag
  irqchip/atmel-aic5: Handle suspend to RAM
  irqchip: Add Mediatek mtk-cirq driver
  dt-bindings: mtk-cirq: Add binding document
  irqchip/gic-v3-its: Add IORT hook for platform MSI support
  irqchip/mbigen: Add ACPI support
  irqchip/mbigen: Introduce mbigen_of_create_domain()
  irqchip/mbigen: Drop module owner
  platform-msi: Make platform_msi_create_device_domain() ACPI aware
  irqchip/gicv3-its: platform-msi: Scan MADT to create platform msi domain
  irqchip/gicv3-its: platform-msi: Refactor its_pmsi_init() to prepare for ACPI
  irqchip/gicv3-its: platform-msi: Refactor its_pmsi_prepare()
  irqchip/gic-v3-its: Keep the include header files in alphabetic order
  ...
2017-05-01 15:46:13 -07:00
Thomas Gleixner
7a258ff04f Merge branch 'linus' into irq/core
Pick up upstream fixes to avoid conflicts with pending patches.
2017-04-20 16:05:13 +02:00
Keith Busch
b72f8051f3 genirq/affinity: Fix calculating vectors to assign
The vectors_per_node is calculated from the remaining available vectors.
The current vector starts after pre_vectors, so we need to subtract that
from the current to properly account for the number of remaining vectors
to assign.

Fixes: 3412386b53 ("irq/affinity: Fix extra vecs calculation")
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Link: http://lkml.kernel.org/r/1492645870-13019-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-20 16:03:09 +02:00
Christoph Hellwig
25ce4be724 genirq: Return the IRQ name from free_irq()
This allows callers to get back at them instead of having to store it in
another variable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-18 13:40:00 -05:00
Christoph Hellwig
a7e60e55d7 genirq: Fix indentation in remove_irq()
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-18 13:39:20 -05:00
Hans de Goede
382bd4de61 genirq: Use irqd_get_trigger_type to compare the trigger type for shared IRQs
When requesting a shared irq with IRQF_TRIGGER_NONE then the irqaction
flags get filled with the trigger type from the irq_data:

        if (!(new->flags & IRQF_TRIGGER_MASK))
                new->flags |= irqd_get_trigger_type(&desc->irq_data);

On the first setup_irq() the trigger type in irq_data is NONE when the
above code executes, then the irq is started up for the first time and
then the actual trigger type gets established, but that's too late to fix
up new->flags.

When then a second user of the irq requests the irq with IRQF_TRIGGER_NONE
its irqaction's triggertype gets set to the actual trigger type and the
following check fails:

        if (!((old->flags ^ new->flags) & IRQF_TRIGGER_MASK))

Resulting in the request_irq failing with -EBUSY even though both
users requested the irq with IRQF_SHARED | IRQF_TRIGGER_NONE

Fix this by comparing the new irqaction's trigger type to the trigger type
stored in the irq_data which correctly reflects the actual trigger type
being used for the irq.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20170415100831.17073-1-hdegoede@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-15 15:42:43 +02:00