Commit graph

12414 commits

Author SHA1 Message Date
Linus Torvalds
98f7fdced2 Two fixes:
- Fix a MIPS IRQ handling RCU bug
  - Remove a DocBook annotation for a parameter that doesn't exist anymore
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDq9IcRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gOnA//U6VcTbivf1QG05Omof+378iFZWWSdDHP
 /iF12DrtcalqCH73o4iuiEg3W8PdISqvLh5QI9G1WvKNXOmX/RRoFPpd1W24uM/D
 DM9m3hEjjEX3AmRZoQ7yrA2Jzd435DKQOHjzClV/5ykKiyBw533twHm7L9sI6WCl
 fsP5yiCMXUP2wS2Go/wWgRjrIIb5iQezXGGxeKhP4zW/lam86ntPcdm8I6EuRD7r
 bT4TXmJkjk98FL+A6QSk8cP+ZlzauHkNa0HYiSz1pvOnwPDYs8IXacCEeOwog3Wr
 hH0IXbX2NXJIUPDQc/BvWzOMxRNBgMZ1ixnj8B2kaDfIM5YfaGAV3/GuogFCUSns
 1u9E+LkLohvE61w7m49PjnYIlWSkuQFrMlJL+69O2wiqmbFX9aQa/T9i7Mq7bmsv
 Q5qTe935QRVcraUPQ5h0uJnKmBL8xteHpQSSrF5ppsC8cxEDX/CC/GsqAn0xG15Q
 VZ91ocXr58zCXDpLbesULY7LbGyB6c1Kt55WYxhp/IOtRhEpVzD5Y9ukXYdckIpq
 25TMFsPyTUTaS3r8nD8G0QeW3CLVj8Lzi/r0P3oa8GvgyqxlX7jzXI8aGPRb7QIi
 JtX6iY2nH0DZfp600VMJNUS67yMDuDXJ2Me5WY3XkvpYuFtPGRm41HOs3dEV+Ay1
 nxBMxUC/6Ug=
 =xghF
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Two fixes:

   - Fix a MIPS IRQ handling RCU bug

   - Remove a DocBook annotation for a parameter that doesn't exist
     anymore"

* tag 'irq-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry
  genirq/irqdesc: Drop excess kernel-doc entry @lookup
2021-07-11 11:17:57 -07:00
Linus Torvalds
81361b837a Kbuild updates for v5.14
- Increase the -falign-functions alignment for the debug option.
 
  - Remove ugly libelf checks from the top Makefile.
 
  - Make the silent build (-s) more silent.
 
  - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified.
 
  - Various script cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmDon90VHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGWFUP/RGNwlGD/YV1xg0ZmM0/ynBzzOy2
 3dcr3etJZpipQDeqnHy3jt0esgMVlbkTdrHvP+2hpNaeXFwjF1fDHjhur9m8ZkVD
 efOA6nugOnNwhy2G3BvtCJv+Vhb+KZ0nNLB27z3Bl0LGP6LJdMRNAxFBJMv4k3aR
 F3sABugwCpnT2/YtuprxRl2/3/CyLur5NjY24FD+ugON3JIWfl6ETbHeFmxr1JE4
 mE+zaN5AwYuSuH9LpdRy85XVCcW/FFqP/DwOFllVvCCCNvvS0KWYSNHWfEsKdR75
 hmAAaS/rpi2eaL0vp88sNhAtYnhMSf+uFu0fyfYeWZuJqMt4Xz5xZKAzDsifCdif
 aQ6UEPDjiKABh9gpX26BMd2CXzkGR+L4qZ7iBPfO586Iy7opajrFX9kIj5U7ZtCl
 wsPat/9+18xpVJOTe0sss3idId7Ft4cRoW5FQMEAW2EWJ9fXAG1yDxEREj1V5gFx
 sMXtpmCoQag968qjfARvP08s3MB1P4Ij6tXcioGqHuEWeJLxOMK/KWyafQUg611d
 0kSWNO0OMo+odBj6j/vM+MIIaPhgwtZnPgw2q4uHGMcemzQxaEvGW+G/5a5qEpTv
 SKm8W24wXplNot4tuTGWq5/jANRJcMvVsyC48DYT81OZEOWrIc0kDV4v4qZToTxW
 97jn1NKa2H6L0J1V
 =Za8V
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Increase the -falign-functions alignment for the debug option.

 - Remove ugly libelf checks from the top Makefile.

 - Make the silent build (-s) more silent.

 - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified.

 - Various script cleanups

* tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits)
  scripts: add generic syscallnr.sh
  scripts: check duplicated syscall number in syscall table
  sparc: syscalls: use pattern rules to generate syscall headers
  parisc: syscalls: use pattern rules to generate syscall headers
  nds32: add arch/nds32/boot/.gitignore
  kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set
  kbuild: modpost: Explicitly warn about unprototyped symbols
  kbuild: remove trailing slashes from $(KBUILD_EXTMOD)
  kconfig.h: explain IS_MODULE(), IS_ENABLED()
  kconfig: constify long_opts
  scripts/setlocalversion: simplify the short version part
  scripts/setlocalversion: factor out 12-chars hash construction
  scripts/setlocalversion: add more comments to -dirty flag detection
  scripts/setlocalversion: remove workaround for old make-kpkg
  scripts/setlocalversion: remove mercurial, svn and git-svn supports
  kbuild: clean up ${quiet} checks in shell scripts
  kbuild: sink stdout from cmd for silent build
  init: use $(call cmd,) for generating include/generated/compile.h
  kbuild: merge scripts/mkmakefile to top Makefile
  sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild
  ...
2021-07-10 11:01:38 -07:00
Linus Torvalds
379cf80a98 - fix for accesing gic via vdso
- two build fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmDpjpwaHHRzYm9nZW5k
 QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHBldw//fThtL09pNt3SfElnclT5
 wozVB0ViUtLRD26q4XAJjoMDb1M8GV3LaJ7AKu+BiwD8HLbIBZ1ZSZsqM/3lb9Ez
 /PusTuvzu06tH48IatRCwRWYnswT4RTiuyeFokDZaX2pWlbuW3L+z1JX7PUFi/0y
 ulu/4bAfj/5up/vNSr2HEHHpVT5vMNpUkpbQwxC+Oy5Nn+prVb4fqGa4YkJJj0f9
 YvzdJYJUp0u1p3YKd9EwfDIgA2KgwKUKp1sJahCkv88DjUv1UbESfMvPrKcB1yxn
 yd1hVcb4QgwslhDHKKVeIvPSp0yAndng/0TA3Fx8s1eXx07RohYI6QVQs+UbXPlV
 CTkoBxrvNT922w/5lUSsCu01NGaO43sxAzygetEjjLXVhbcjHvQiNYfNqqNzRcL1
 helTLMmI9Y4sj0PflqZenQwiwe/QJiWMvuVIibEst5o+eHccyqhD6ylmrGTqAZqw
 mEqBSLeKJw0TGuY/B+JD1hjiXVk2OYFL63Y9oNauj2U5bXP1K8Tpg5+u6TYZFN6d
 f4iO17/BmgZRisCbxAR2sGFP/hclquwAmxARnJ6/cSgOvl6P34qzIcnmqXVOn5g0
 CxA3pLoEm7BmyUTgpceWAFqlY650tpEYjeVW+glRcw9w2s0pAYyO7KVtXK32e7mX
 8TTNg2i9P+RJO/25f5k+Lq8=
 =Lk+H
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - fix for accesing gic via vdso

 - two build fixes

* tag 'mips_5.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: vdso: Invalid GIC access through VDSO
  mips: disable branch profiling in boot/decompress.o
  mips: always link byteswap helpers into decompressor
2021-07-10 10:36:33 -07:00
Thomas Gleixner
4840048356 irqchip fixes for 5.14, take #1
- Fix a MIPS bug where irqdomain loopkups could occur in a context
   where RCU is not allowed
 
 - Fix a documentation bug for handle_domain_irq
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDoFZwPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDkxgP/0IlcPoBfamLbzpjyrMm0359AIllKP7Vgpb6
 IaDnoiWTCe1B6rFoP59j357iGCTvB04+oJ17DsB51R9RKaM2sdhi2dKxbEHRd+se
 v1EyU5gmVB7JJP/1lPnjYdPGnXz73jjEq5MRI6o4yLG2xnQ+ZcdHifuP/tD7glGZ
 UmJkAWrq53IbBCja9HbtbAQanuDtDu6xgYIweaCdf2oKzNS9jQVvjqFokMd0AcIA
 juY8xTFNzHoA/8AF8eBUE5TfLVcG8j3P31ffw4gVvzEkman77AP5DZ8qkSvi7plH
 wOdjlBrsGTfoti4kfAIvsfi6zBHyXJEaW0Vd38uaA+cXYI1QNiZ9qqzQPvjuK9gs
 VFORcWe2pDiI1q4mg1pz7dGBy0FoEswe4uVnr6xm+vn98KeAn/gS5Dc/K1JpmfN+
 iOJt9H7hvDrUC/KnntzuRY82I8y7gDyyjGDJHhFlWgZBeONPhpOiE/d6xbxGtQm8
 SpVBD9QZnMBxDY2eVNQp31SCdrwLCLczNeQrJHP6Oh9LLmjQq3VD4M3OpBLpEA8e
 8WIiH9vJfXI5emU1wQRkscyA8tT2mAo3Kb192fpC5nDwTMSxbfk1l94oLGUldcV5
 liQ78yd/12mJBMS5MJPsA39g1ww2vv5m6Bq0JDwDu33l1qKrlRJqveeS8UjDSXrD
 s3cVrxFO
 =eTw3
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-fixes-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip fixes from Marc Zyngier:

 - Fix a MIPS bug where irqdomain loopkups could occur in a context
   where RCU is not allowed

 - Fix a documentation bug for handle_domain_irq
2021-07-09 15:35:13 +02:00
Martin Fäcknitz
47ce8527fb MIPS: vdso: Invalid GIC access through VDSO
Accessing raw timers (currently only CLOCK_MONOTONIC_RAW) through VDSO
doesn't return the correct time when using the GIC as clock source.
The address of the GIC mapped page is in this case not calculated
correctly. The GIC mapped page is calculated from the VDSO data by
subtracting PAGE_SIZE:

  void *get_gic(const struct vdso_data *data) {
    return (void __iomem *)data - PAGE_SIZE;
  }

However, the data pointer is not page aligned for raw clock sources.
This is because the VDSO data for raw clock sources (CS_RAW = 1) is
stored after the VDSO data for coarse clock sources (CS_HRES_COARSE = 0).
Therefore, only the VDSO data for CS_HRES_COARSE is page aligned:

  +--------------------+
  |                    |
  | vd[CS_RAW]         | ---+
  | vd[CS_HRES_COARSE] |    |
  +--------------------+    | -PAGE_SIZE
  |                    |    |
  |  GIC mapped page   | <--+
  |                    |
  +--------------------+

When __arch_get_hw_counter() is called with &vd[CS_RAW], get_gic returns
the wrong address (somewhere inside the GIC mapped page). The GIC counter
values are not returned which results in an invalid time.

Fixes: a7f4df4e21 ("MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()")
Signed-off-by: Martin Fäcknitz <faecknitz@hotsplots.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-09 15:29:06 +02:00
Marc Zyngier
1fee9db9b4 irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry
Since d4a45c68dc ("irqdomain: Protect the linear revmap with RCU"),
any irqdomain lookup requires the RCU read lock to be held.

This assumes that the architecture code will be structured such as
irq_enter() will be called *before* the interrupt is looked up
in the irq domain. However, this isn't the case for MIPS, and a number
of drivers are structured to do it the other way around when handling
an interrupt in their root irqchip (secondary irqchips are OK by
construction).

This results in a RCU splat on a lockdep-enabled kernel when the kernel
takes an interrupt from idle, as reported by Guenter Roeck.

Note that this could have fired previously if any driver had used
tree-based irqdomain, which always had the RCU requirement.

To solve this, provide a MIPS-specific helper (do_domain_IRQ())
as the pendent of do_IRQ() that will do thing in the right order
(and maybe save some cycles in the process).

Ideally, MIPS would be moved over to using handle_domain_irq(),
but that's much more ambitious.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20210705172352.GA56304@roeck-us.net
Link: https://lore.kernel.org/r/20210706110647.3979002-1-maz@kernel.org
2021-07-09 10:18:58 +01:00
Aneesh Kumar K.V
dc4875f0e7 mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *
No functional change in this patch.

[aneesh.kumar@linux.ibm.com: m68k build error reported by kernel robot]
  Link: https://lkml.kernel.org/r/87tulxnb2v.fsf@linux.ibm.com

Link: https://lkml.kernel.org/r/20210615110859.320299-2-aneesh.kumar@linux.ibm.com
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:22 -07:00
Aneesh Kumar K.V
9cf6fa2458 mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *
No functional change in this patch.

[aneesh.kumar@linux.ibm.com: fix]
  Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com
[sfr@canb.auug.org.au: another fix]
  Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com

Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:22 -07:00
Linus Torvalds
c932ed0adb TTY / Serial patches for 5.14-rc1
Here is the big set of tty and serial driver patches for 5.14-rc1.
 
 A bit more than normal, but nothing major, lots of cleanups.  Highlights
 are:
 	- lots of tty api cleanups and mxser driver cleanups from Jiri
 	- build warning fixes
 	- various serial driver updates
 	- coding style cleanups
 	- various tty driver minor fixes and updates
 	- removal of broken and disable r3964 line discipline (finally!)
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYOM4qQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylKvQCfbh+OmTkDlDlDhSWlxuV05M1XTXoAoLUcLZru
 s5JCnwSZztQQLMDHj7Pd
 =Zupm
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
 "Here is the big set of tty and serial driver patches for 5.14-rc1.

  A bit more than normal, but nothing major, lots of cleanups.
  Highlights are:

   - lots of tty api cleanups and mxser driver cleanups from Jiri

   - build warning fixes

   - various serial driver updates

   - coding style cleanups

   - various tty driver minor fixes and updates

   - removal of broken and disable r3964 line discipline (finally!)

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (227 commits)
  serial: mvebu-uart: remove unused member nb from struct mvebu_uart
  arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART
  dt-bindings: mvebu-uart: fix documentation
  serial: mvebu-uart: correctly calculate minimal possible baudrate
  serial: mvebu-uart: do not allow changing baudrate when uartclk is not available
  serial: mvebu-uart: fix calculation of clock divisor
  tty: make linux/tty_flip.h self-contained
  serial: Prefer unsigned int to bare use of unsigned
  serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs
  serial: qcom_geni_serial: use DT aliases according to DT bindings
  Revert "tty: serial: Add UART driver for Cortina-Access platform"
  tty: serial: Add UART driver for Cortina-Access platform
  MAINTAINERS: add me back as mxser maintainer
  mxser: Documentation, fix typos
  mxser: Documentation, make the docs up-to-date
  mxser: Documentation, remove traces of callout device
  mxser: introduce mxser_16550A_or_MUST helper
  mxser: rename flags to old_speed in mxser_set_serial_info
  mxser: use port variable in mxser_set_serial_info
  mxser: access info->MCR under info->slock
  ...
2021-07-05 14:08:24 -07:00
Linus Torvalds
a16d8644ba Staging / IIO driver patches for 5.14-rc1
Here is the big set of IIO and staging driver patches for 5.14-rc1.
 
 Loads of IIO driver updates and additions in here, the shortlog has the
 full details.
 
 For the staging side, we moved a few drivers out of staging, and deleted
 the kpc2000 drivers as the original developer asked us to because no one
 was working on them anymore.
 
 Also in here are loads of coding style cleanups due to different intern
 projects focusing on the staging tree to try to get experience doing
 kernel development.
 
 All of these have been in the linux-next tree for a while with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYOM50w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykZ4wCeK/JreZijlAy0O5Gq1equvRx1jJoAoJmmt7UY
 bx6qpcmUM7c53cMXr/kh
 =6suo
 -----END PGP SIGNATURE-----

Merge tag 'staging-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging / IIO driver updates from Greg KH:
 "Here is the big set of IIO and staging driver patches for 5.14-rc1.

  Loads of IIO driver updates and additions in here, the shortlog has
  the full details.

  For the staging side, we moved a few drivers out of staging, and
  deleted the kpc2000 drivers as the original developer asked us to
  because no one was working on them anymore.

  Also in here are loads of coding style cleanups due to different
  intern projects focusing on the staging tree to try to get experience
  doing kernel development.

  All of these have been in the linux-next tree for a while with no
  reported problems"

* tag 'staging-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (744 commits)
  staging: hi6421-spmi-pmic: cleanup some macros
  staging: hi6421-spmi-pmic: change identation of a table
  staging: hi6421-spmi-pmic: change a return code
  staging: hi6421-spmi-pmic: better name IRQs
  staging: hi6421-spmi-pmic: use devm_request_threaded_irq()
  staging: hisilicon,hi6421-spmi-pmic.yaml: cleanup descriptions
  spmi: hisi-spmi-controller: move driver from staging
  phy: phy-hi3670-usb3: move driver from staging into phy
  staging: rtl8188eu: remove include/rtw_debug.h header
  staging: rtl8188eu: remove GlobalDebugLevel variable
  staging: rtl8188eu: remove DRIVER_PREFIX preprocessor definition
  staging: rtl8188eu: remove RT_TRACE macro
  staging: rtl8188eu: remove all RT_TRACE calls from hal/rtl8188eu_recv.c
  staging: rtl8188eu: remove all RT_TRACE calls from hal/hal_intf.c
  staging: rtl8188eu: remove all RT_TRACE calls from hal/rtl8188eu_xmit.c
  staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_xmit.c
  staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_pwrctrl.c
  staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_recv.c
  staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_ioctl_set.c
  staging: rtl8188eu: remove all RT_TRACE calls from core/rtw_ieee80211.c
  ...
2021-07-05 14:01:53 -07:00
Randy Dunlap
97e488073c mips: disable branch profiling in boot/decompress.o
Use DISABLE_BRANCH_PROFILING for arch/mips/boot/compressed/decompress.o
to prevent linkage errors.

mips64-linux-ld: arch/mips/boot/compressed/decompress.o: in function `LZ4_decompress_fast_extDict':
decompress.c:(.text+0x8c): undefined reference to `ftrace_likely_update'
mips64-linux-ld: decompress.c:(.text+0xf4): undefined reference to `ftrace_likely_update'
mips64-linux-ld: decompress.c:(.text+0x200): undefined reference to `ftrace_likely_update'
mips64-linux-ld: decompress.c:(.text+0x230): undefined reference to `ftrace_likely_update'
mips64-linux-ld: decompress.c:(.text+0x320): undefined reference to `ftrace_likely_update'
mips64-linux-ld: arch/mips/boot/compressed/decompress.o:decompress.c:(.text+0x3f4): more undefined references to `ftrace_likely_update' follow

Fixes: e76e1fdfa8 ("lib: add support for LZ4-compressed kernel")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-05 17:04:01 +02:00
Arnd Bergmann
cddc40f561 mips: always link byteswap helpers into decompressor
My series to clean up the unaligned access implementation
across architectures caused some mips randconfig builds to
fail with:

   mips64-linux-ld: arch/mips/boot/compressed/decompress.o: in function `decompress_kernel':
   decompress.c:(.text.decompress_kernel+0x54): undefined reference to `__bswapsi2'

It turns out that this problem has already been fixed for the XZ
decompressor but now it also shows up in (at least) LZO and LZ4.  From my
analysis I concluded that the compiler could always have emitted those
calls, but the different implementation allowed it to make otherwise
better decisions about not inlining the byteswap, which results in the
link error when the out-of-line code is missing.

While it could be addressed by adding it to the two decompressor
implementations that are known to be affected, but as this only adds
112 bytes to the kernel, the safer choice is to always add them.

Fixes: c50ec67875 ("MIPS: zboot: Fix the build with XZ compression on older GCC versions")
Fixes: 0652035a57 ("asm-generic: unaligned: remove byteshift helpers")
Link: https://lore.kernel.org/linux-mm/202106301304.gz2wVY9w-lkp@intel.com/
Link: https://lore.kernel.org/linux-mm/202106260659.TyMe8mjr-lkp@intel.com/
Link: https://lore.kernel.org/linux-mm/202106172016.onWT6Tza-lkp@intel.com/
Link: https://lore.kernel.org/linux-mm/202105231743.JJcALnhS-lkp@intel.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-05 17:03:11 +02:00
Linus Torvalds
4cad671979 asm-generic/unaligned: Unify asm/unaligned.h around struct helper
The get_unaligned()/put_unaligned() helpers are traditionally architecture
 specific, with the two main variants being the "access-ok.h" version
 that assumes unaligned pointer accesses always work on a particular
 architecture, and the "le-struct.h" version that casts the data to a
 byte aligned type before dereferencing, for architectures that cannot
 always do unaligned accesses in hardware.
 
 Based on the discussion linked below, it appears that the access-ok
 version is not realiable on any architecture, but the struct version
 probably has no downsides. This series changes the code to use the
 same implementation on all architectures, addressing the few exceptions
 separately.
 
 Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
 Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2
 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmDfFx4ACgkQmmx57+YA
 GNkqzRAAjdlIr8M+xI2CyT0/A9tswYfLMeWejmYopq3zlxI6RnvPiJJDIdY2I8US
 1npIiDo55w061CnXL9rV65ocL3XmGu1mabOvgM6ATsec+8t4WaXBV9tysxTJ9ea0
 ltLTa2P5DXWALvWiVMTME7hFaf1cW+8Uqt3LmXxDp2l5zasXajCHAH6YokON2PfM
 CsaRhwSxIu8Sbnu/IQGBI9JW5UXsBfKSyUwtM0OwP7jFOuIeZ4WBVA+j6UxONnFC
 wouKmAM/ThoOsaV9aP4EZLIfBx8d4/hfYQjZ958kYXurerruYkJeEqdIRbV0QqTy
 2O6ZrJ6uqPlzfWz9h458me2dt98YEtALHV/3DCWUcBfHmUQtxElyJYEhG0YjVF3H
 5RYtjw8Q2LS/QR5ask1Xn0JfT89rRnLi2migAtsA4Ce70JP4Us6wGobkj4SHlgDt
 P7+eVq2Mkhqw/kmV8N4p+ZS5lpkK0JniDN+ONDhkZqHL/zXG/HQzx9wLV69jlvo2
 ASevKxITdi+bKHWs5ANungkBOnBUQZacq46mVyi4HPDwMAFyWvVYTbFumy9koagQ
 o9NEgX3RsZcxxi7bU1xuFPFMLMlUQT3Nb30+84B4fKe9FmvHC1hizTiCnp7q4bZr
 z6a6AMHke7YLqKZOqzTJGRR3lPoZZDCb775SAd70LQp6XPZXOHs=
 =IY5U
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm/unaligned.h unification from Arnd Bergmann:
 "Unify asm/unaligned.h around struct helper

  The get_unaligned()/put_unaligned() helpers are traditionally
  architecture specific, with the two main variants being the
  "access-ok.h" version that assumes unaligned pointer accesses always
  work on a particular architecture, and the "le-struct.h" version that
  casts the data to a byte aligned type before dereferencing, for
  architectures that cannot always do unaligned accesses in hardware.

  Based on the discussion linked below, it appears that the access-ok
  version is not realiable on any architecture, but the struct version
  probably has no downsides. This series changes the code to use the
  same implementation on all architectures, addressing the few
  exceptions separately"

Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2
Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/

* tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: simplify asm/unaligned.h
  asm-generic: uaccess: 1-byte access is always aligned
  netpoll: avoid put_unaligned() on single character
  mwifiex: re-fix for unaligned accesses
  apparmor: use get_unaligned() only for multi-byte words
  partitions: msdos: fix one-byte get_unaligned()
  asm-generic: unaligned always use struct helpers
  asm-generic: unaligned: remove byteshift helpers
  powerpc: use linux/unaligned/le_struct.h on LE power7
  m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  sh: remove unaligned access for sh4a
  openrisc: always use unaligned-struct header
  asm-generic: use asm-generic/unaligned.h for most architectures
2021-07-02 12:43:40 -07:00
Linus Torvalds
71bd934101 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "190 patches.

  Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
  vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
  migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
  zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
  core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
  signals, exec, kcov, selftests, compress/decompress, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
  ipc/util.c: use binary search for max_idx
  ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
  ipc: use kmalloc for msg_queue and shmid_kernel
  ipc sem: use kvmalloc for sem_undo allocation
  lib/decompressors: remove set but not used variabled 'level'
  selftests/vm/pkeys: exercise x86 XSAVE init state
  selftests/vm/pkeys: refill shadow register after implicit kernel write
  selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
  selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
  kcov: add __no_sanitize_coverage to fix noinstr for all architectures
  exec: remove checks in __register_bimfmt()
  x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
  hfsplus: report create_date to kstat.btime
  hfsplus: remove unnecessary oom message
  nilfs2: remove redundant continue statement in a while-loop
  kprobes: remove duplicated strong free_insn_page in x86 and s390
  init: print out unknown kernel parameters
  checkpatch: do not complain about positive return values starting with EPOLL
  checkpatch: improve the indented label test
  checkpatch: scripts/spdxcheck.py now requires python3
  ...
2021-07-02 12:08:10 -07:00
Linus Torvalds
19b4385922 - added support for OpeneEmbed SOM9331 board
- Ingenic fixes/improvments
 - other fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmDdyJQaHHRzYm9nZW5k
 QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHBlyRAAhj5dC1EyGsc0cWyzC8ZU
 Nh45vzMCSxa4mxAelNjuVKdE0h7gLeRa8/sKPV+EcS3ZFcSfZQeIHEbH9Na3EDS7
 KtUZmkjrHCDdRTh7kou7E7mb716HvoQEyq6d1VyZOahyqf2ZcjIsFinK+As+4wBV
 JcXJMcpUWemI4Ojm8cbmNWQW3V2Ty9qLNUa6BpmntbdOdYowTih9QWHv2u1YsOR7
 R6LJXdyo6V1RieeqfZaWXTQtN8yyXYhBewLIy0DxBAm329f5sRHUVd9/ZD2RMByD
 1weEhbs0jhmYZFSfkwZ8SjAb4GkusjNTnDiK4Rsuz6pQK0BIGNAV0Mrnedq+i5eD
 wrrTI1/envStDpj9XGlSNajcaQGpTza1V1uaCIm4EanOMMTBc8DTXaj7MCOlTD8j
 fkNE3ykfloVSSzZAWCmcpV9fBsFwQp3m+cWrtIAAnOJDK+NV5FfABUwFHKnBUpxG
 YOUNCed0WJFx++xrHKylt8hWILLEATLHh1h5vDsEYJi4gwt2KxCSYQBGgSYKa1At
 tOUa5UINTMvpe4U9GEjHh6f1VedtUNYUzXjD5g2cn62d81RCSx3hB1KkfQSKtaDw
 H9sR8d+rDFt4fK9T/HPZ3EyM2i9FkZNv+CulfGNd3M0vnF8+qX2xRehWrqhwBBi0
 2p7V9/t5TPhziAfJySKX+Jg=
 =j2kH
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Thomas Bogendoerfer:

 - add support for OpeneEmbed SOM9331 board

 - Ingenic fixes/improvments

 - other fixes and cleanups

* tag 'mips_5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (39 commits)
  MIPS: Fix PKMAP with 32-bit MIPS huge page support
  MIPS: CI20: Add second percpu timer for SMP.
  MIPS: CI20: Reduce clocksource to 750 kHz.
  MIPS: Ingenic: Add MAC syscon nodes for Ingenic SoCs.
  dt-bindings: clock: Add documentation for MAC PHY control bindings.
  MIPS: X1830: Respect cell count of common properties.
  MIPS: set mips32r5 for virt extensions
  MIPS: loongsoon64: Reserve memory below starting pfn to prevent Oops
  MIPS: MT extensions are not available on MIPS32r1
  mips/kvm: Use BUG_ON instead of if condition followed by BUG
  MIPS: OCTEON: octeon-usb: Use devm_platform_get_and_ioremap_resource()
  MIPS: add PMD table accounting into MIPS'pmd_alloc_one
  MIPS: Loongson64: fix spelling of SPDX tag
  MIPS: ingenic: rs90: Add dedicated VRAM memory region
  MIPS: ingenic: gcw0: Set codec to cap-less mode for FM radio
  MIPS: ingenic: jz4780: Fix I2C nodes to match DT doc
  MIPS: ingenic: Select CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMER
  MIPS: Kconfig: ingenic: Ensure MACH_INGENIC_GENERIC selects all SoCs
  MIPS: cpu-probe: Fix FPU detection on Ingenic JZ4760(B)
  MIPS: boot: Support specifying UART port on Ingenic SoCs
  ...
2021-07-01 17:03:11 -07:00
Linus Torvalds
a32b344e6f This is the bulk of pin control changes for the v5.14 kernel:
New drivers:
 
 - Last merge window we created a driver for the Ralink RT2880.
   We are now moving the Ralink SoC pin control drivers out of the MIPS
   architecture code and into the pin control subsystem. This concerns
   RT288X, MT7620, RT305X, RT3883 and MT7621.
 
 - Qualcomm SM6125 SoC pin control driver.
 
 - Qualcomm spmi-gpio support for PM7325.
 
 - Qualcomm spmi-mpp also handles PMI8994 (just a compatible string)
 
 - Mediatek MT8365 SoC pin controller.
 
 - New device HID for the AMD GPIO controller.
 
 Improvements:
 
 - Pin bias config support for a slew of Renesas pin controllers.
 
 - Incremental improvements and non-urgent bug fixes to the Renesas
   SoC drivers.
 
 - Implement irq_set_wake on the AMD pin controller so we can wake
   up from external pin events.
 
 Misc:
 
 - Devicetree bindings for the Apple M1 pin controller, we will probably
   see a proper driver for this soon as well.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmDeTu0ACgkQQRCzN7AZ
 XXO2Yg/+LHbqYX8V+Ig1ZcY4p5bfbGyyC6QG6g3d/kzzCmsjHFgmDFQoZ+LoRx+p
 FRUSvmiR0VERMZCEepHsZgzns6ezzJfBt4Cu/388d4iYZppaETpQV47TzqY3eP7Q
 4Shu2wIKwd7C3vNrCifub0JOYAAEsqdlHd75g0bqhal9hgH/MgYQSq9F22/TKAFl
 hteFwyw5L4OwKIDUpqDOIcG8thhHYWrQy77/Pp82/TVnmO9gamt863dKBjIg6iF9
 c+pmIWI8K2mBhNO+epGG4VSroUudIBwKV88nwUjKSe+pu0VAU7lit/V0Uh1IhG0s
 FUHHGDeF62Ncn4SOYetlnSlKbQkhJaBDV2sDgQ3xzqvs1P3WEHRWqYIh1egq5iW6
 /KtpSlRLQ/aO+k0iN66pErpAfsGNFAxkqlCSypyJG7ROnb2rADzZ0ftEKQb8RzZb
 nypPupOO5/bFfQHbQtFORDaNu9MUTR5PR04eTPMoApG0nv7zY+kcJ6iJuKE9spLb
 ahoxLstfQ/fKK27yms72E6PqwanuUEzcQv7gjhuHmFEjNrW1ARUqoa5hpdAzhZOX
 20P8SZWkSeUZnqB26YQq+1U9p6wV0064Vp+jYY/wzQpV40dgX9oumiRkxCWCzpjt
 6mw6x9txlrEEu+2WadW8yZd4ewKvWFLEGI+C/83pnI5NF1Dp0Go=
 =Ajcr
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "This is the bulk of pin control changes for the v5.14 kernel. Not so
  much going on. No core changes, just drivers.

  The most interesting would be that MIPS Ralink is migrating to pin
  control and we have some bindings but not yet code for the Apple M1
  pin controller.

  New drivers:

   - Last merge window we created a driver for the Ralink RT2880. We are
     now moving the Ralink SoC pin control drivers out of the MIPS
     architecture code and into the pin control subsystem. This concerns
     RT288X, MT7620, RT305X, RT3883 and MT7621.

   - Qualcomm SM6125 SoC pin control driver.

   - Qualcomm spmi-gpio support for PM7325.

   - Qualcomm spmi-mpp also handles PMI8994 (just a compatible string)

   - Mediatek MT8365 SoC pin controller.

   - New device HID for the AMD GPIO controller.

  Improvements:

   - Pin bias config support for a slew of Renesas pin controllers.

   - Incremental improvements and non-urgent bug fixes to the Renesas
     SoC drivers.

   - Implement irq_set_wake on the AMD pin controller so we can wake up
     from external pin events.

  Misc:

   - Devicetree bindings for the Apple M1 pin controller, we will
     probably see a proper driver for this soon as well"

* tag 'pinctrl-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (54 commits)
  pinctrl: ralink: rt305x: add missing include
  pinctrl: stm32: check for IRQ MUX validity during alloc()
  pinctrl: zynqmp: some code cleanups
  drivers: qcom: pinctrl: Add pinctrl driver for sm6125
  dt-bindings: pinctrl: qcom: sm6125: Document SM6125 pinctrl driver
  dt-bindings: pinctrl: mcp23s08: add documentation for reset-gpios
  pinctrl: mcp23s08: Add optional reset GPIO
  pinctrl: mediatek: fix mode encoding
  pinctrl: mcp23s08: Fix missing unlock on error in mcp23s08_irq()
  pinctrl: bcm: Constify static pinmux_ops
  pinctrl: bcm: Constify static pinctrl_ops
  pinctrl: ralink: move RT288X SoC pinmux config into a new 'pinctrl-rt288x.c' file
  pinctrl: ralink: move MT7620 SoC pinmux config into a new 'pinctrl-mt7620.c' file
  pinctrl: ralink: move RT305X SoC pinmux config into a new 'pinctrl-rt305x.c' file
  pinctrl: ralink: move RT3883 SoC pinmux config into a new 'pinctrl-rt3883.c' file
  pinctrl: ralink: move MT7621 SoC pinmux config into a new 'pinctrl-mt7621.c' file
  pinctrl: ralink: move ralink architecture pinmux header into the driver
  pinctrl: single: config: enable the pin's input
  pinctrl: mtk: Fix mt8365 Kconfig dependency
  pinctrl: mcp23s08: fix race condition in irq handler
  ...
2021-07-01 16:57:14 -07:00
Linus Torvalds
514798d365 This round has a diffstat dominated by Qualcomm clk drivers. Honestly though
that's just a bunch of data so the diffstat reflects that. Looking beyond that
 there's just a bunch of updates all around in various clk drivers. Renesas and
 NXP (for i.MX) are two SoC vendors that have a lot of patches in here. Overall
 the driver changes look to be mostly enabling more clks and non-critical fixes
 that we could hold until the next merge window.
 
 I'm especially excited about the series from Arnd that graduates clkdev to be
 the only implementation of clk_get() and clk_put(). That's a good step in the
 right direction to migreate eveerything over to the common clk framework. Now
 we don't have to worry about clkdev specific details, they're just part of the
 clk API now.
 
 Core:
  - clkdev is now the only option, i.e. clk_get()/clk_put() is implemented in
    only one place in the kernel instead of in drivers/clk/clkdev.c and in
    architectures that want their own implementation
 
 New Drivers:
  - Texas Instruments' LMK04832 Ultra Low-Noise JESD204B Compliant Clock
    Jitter Cleaner With Dual Loop PLLs
  - Qualcomm MDM9607 GCC
  - Qualcomm SC8180X display clks
  - Qualcomm SM6125 GCC
  - Qualcomm SM8250 CAMCC (camera)
  - Renesas RZ/G2L SoC
  - Hisilicon hi3559A SoC
 
 Updates:
  - Stop using clock-output-names in ST clk drivers (yay!)
  - Support secure mode of STM32MP1 SoCs
  - Improve clock support for Actions S500 SoC
  - duty cycle setting support on qcom clks
  - Add TI am33xx spread spectrum clock support
  - Use determine_rate() for the Amlogic pll ops instead of round_rate()
  - Restrict Amlogic gp0/1 and audio plls range on g12a/sm1
  - Improve Amlogic axg-audio controller error on deferral
  - Add NNA clocks on Amlogic g12a
  - Reduce memory footprint of Rockchip PLL rate tables
  - A fix for the newly added Rockchip rk3568 clk driver
  - Exported clock for the newly added Rockchip video decoder
  - Remove audio ipg clock from i.MX8MP
  - Remove deprecated legacy clock binding for i.MX SCU clock driver
  - Use common clk-imx8qxp for both i.MX8QXP and i.MX8QM
  - Add multiple clocks to clk-imx8qxp driver (enet, hdmi, lcdif, audio,
    parallel interface)
  - Add dedicated clock ops for i.MX paralel interface
  - Different fixes for clocks controlled by ATF on i.MX SoCs
  - Add A53/A72 frequency scaling support i.MX clk-scu driver
  - Add special case for DCSS clock on suspend for i.MX clk-scu driver
  - Add parent save/restore on suspend/resume to i.MX clk-scu driver
  - Skip runtime PM enablement for CPU clocks in i.MX clk-scu driver
  - Remove the sys1_pll/sys2_pll clock gates for i.MX8MQ and their
    bindings
  - Tegra clk driver no longer deasserts resets on clk_enable as it
    gets in the way of certain power-up sequences
  - Fix compile testing for Tegra clk driver
  - One patch to fix a divider on the Allwinner v3s Audio PLL
  - Add support for CPU core clock boost modes on Renesas R-Car Gen3
  - Add ISPCS (Image Signal Processor) clocks on Renesas R-Car V3U
  - Switch SH/R-Mobile and R-Car "DIV6" clocks to .determine_rate()
    and improve support for multiple parents
  - Switch Renesas RZ/N1 divider clocks to .determine_rate()
  - Add ZA2 (Audio Clock Generator) clock on Renesas R-Car D3
  - Convert ar7 to common clk framework
  - Convert ralink to common clk framework
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmDbu3sRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSV+OA/9EEV3uuauFsxVm8ySX4T8amHAzE98asEX
 XldxMqBuGNnlqJn3A3LeGISKKafaRMkL/7xqBnTi9ZycDy1WRi2SiAKLTDoJCmi7
 ES32EBCO1O9D5uo4mYFsYgHUaxFmE+4tQbtDCttVt59yZEiiNPz0Lm8tWz5yuDzX
 IwCN8HrNShyL4dykTRUDuUkqrTg9sSqSvdG+XcyI24pgLtBWvJU32wIFfLN+/n9C
 JSyYwzHkajoeuv5kpAJ1IV/tzZgy77xQHunsatJWz1qJ1J2eFADWI2p3NVf88N21
 5Mw5xvikMJZ5Xq8pdZKiyEQOFfcxN/+k7hfc6eq3SDpbkaHPti9CX2rv9Uck6rdh
 Bigixsx9IHbQ+1CJAXZxcAJma/GwzoWW1irqzTQoChYgwlJIyPijFqbuJxqS4P0d
 9sEp0WvbdAEgnktiqs7gphki7Q04y2gUD3LKD6hz5sL0vZ+Dy1DY6olkWJefGrHo
 FDnEGf6gsP3vvvlJt5G2zeZQ/NzMKkfaIGLj/1hTtoLMaxpg282cmPXVUxD+ripW
 /GG/z14RdaHQXeMXduo+MeK5qUsO6LspnYown54IWilOOo1m/rfbun3yAFJaphG1
 ZQB+JDfeH8Cv6AYbNwbEpXyXyj2Rz5fGQjA31+97fCCxykZ+suBQkWqK/lUCmTyf
 ofwokRnKiYY=
 =YnCF
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
 "This round has a diffstat dominated by Qualcomm clk drivers. Honestly
  though that's just a bunch of data so the diffstat reflects that.
  Looking beyond that there's just a bunch of updates all around in
  various clk drivers. Renesas and NXP (for i.MX) are two SoC vendors
  that have a lot of patches in here.

  Overall the driver changes look to be mostly enabling more clks and
  non-critical fixes that we could hold until the next merge window.

  I'm especially excited about the series from Arnd that graduates
  clkdev to be the only implementation of clk_get() and clk_put().
  That's a good step in the right direction to migreate eveerything over
  to the common clk framework. Now we don't have to worry about clkdev
  specific details, they're just part of the clk API now.

  Core:
   - clkdev is now the only option, i.e. clk_get()/clk_put() is
     implemented in only one place in the kernel instead of in
     drivers/clk/clkdev.c and in architectures that want their own
     implementation

  New Drivers:
   - Texas Instruments' LMK04832 Ultra Low-Noise JESD204B Compliant
     Clock Jitter Cleaner With Dual Loop PLLs
   - Qualcomm MDM9607 GCC
   - Qualcomm SC8180X display clks
   - Qualcomm SM6125 GCC
   - Qualcomm SM8250 CAMCC (camera)
   - Renesas RZ/G2L SoC
   - Hisilicon hi3559A SoC

  Updates:
   - Stop using clock-output-names in ST clk drivers (yay!)
   - Support secure mode of STM32MP1 SoCs
   - Improve clock support for Actions S500 SoC
   - duty cycle setting support on qcom clks
   - Add TI am33xx spread spectrum clock support
   - Use determine_rate() for the Amlogic pll ops instead of
     round_rate()
   - Restrict Amlogic gp0/1 and audio plls range on g12a/sm1
   - Improve Amlogic axg-audio controller error on deferral
   - Add NNA clocks on Amlogic g12a
   - Reduce memory footprint of Rockchip PLL rate tables
   - A fix for the newly added Rockchip rk3568 clk driver
   - Exported clock for the newly added Rockchip video decoder
   - Remove audio ipg clock from i.MX8MP
   - Remove deprecated legacy clock binding for i.MX SCU clock driver
   - Use common clk-imx8qxp for both i.MX8QXP and i.MX8QM
   - Add multiple clocks to clk-imx8qxp driver (enet, hdmi, lcdif,
     audio, parallel interface)
   - Add dedicated clock ops for i.MX paralel interface
   - Different fixes for clocks controlled by ATF on i.MX SoCs
   - Add A53/A72 frequency scaling support i.MX clk-scu driver
   - Add special case for DCSS clock on suspend for i.MX clk-scu driver
   - Add parent save/restore on suspend/resume to i.MX clk-scu driver
   - Skip runtime PM enablement for CPU clocks in i.MX clk-scu driver
   - Remove the sys1_pll/sys2_pll clock gates for i.MX8MQ and their
     bindings
   - Tegra clk driver no longer deasserts resets on clk_enable as it
     gets in the way of certain power-up sequences
   - Fix compile testing for Tegra clk driver
   - One patch to fix a divider on the Allwinner v3s Audio PLL
   - Add support for CPU core clock boost modes on Renesas R-Car Gen3
   - Add ISPCS (Image Signal Processor) clocks on Renesas R-Car V3U
   - Switch SH/R-Mobile and R-Car "DIV6" clocks to .determine_rate() and
     improve support for multiple parents
   - Switch Renesas RZ/N1 divider clocks to .determine_rate()
   - Add ZA2 (Audio Clock Generator) clock on Renesas R-Car D3
   - Convert ar7 to common clk framework
   - Convert ralink to common clk framework"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (161 commits)
  clk: zynqmp: Handle divider specific read only flag
  clk: zynqmp: Use firmware specific mux clock flags
  clk: zynqmp: Use firmware specific divider clock flags
  clk: zynqmp: Use firmware specific common clock flags
  clk: lmk04832: Use of match table
  clk: lmk04832: Depend on SPI
  clk: stm32mp1: new compatible for secure RCC support
  dt-bindings: clock: stm32mp1 new compatible for secure rcc
  dt-bindings: reset: add MCU HOLD BOOT ID for SCMI reset domains on stm32mp15
  dt-bindings: reset: add IDs for SCMI reset domains on stm32mp15
  dt-bindings: clock: add IDs for SCMI clocks on stm32mp15
  reset: stm32mp1: remove stm32mp1 reset
  clk: hisilicon: Add clock driver for hi3559A SoC
  dt-bindings: Document the hi3559a clock bindings
  clk: si5341: Add sysfs properties to allow checking/resetting device faults
  clk: si5341: Add silabs,iovdd-33 property
  clk: si5341: Add silabs,xaxb-ext-clk property
  clk: si5341: Allow different output VDD_SEL values
  clk: si5341: Update initialization magic
  clk: si5341: Check for input clock presence and PLL lock on startup
  ...
2021-07-01 13:26:16 -07:00
Linus Torvalds
911a2997a5 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmDcl7AACgkQnJ2qBz9k
 QNnsBQf+LBAPsfykQ/f8EdHErO1lfbVTmwf2g/JzTkjrIVZTZ6Ic47aCIiFxgHU2
 Js9ufaPxpsbbopzpn2PAoCUzxNsZDqgXtnC03MOUAqoSFbAvgLHz2sQwjqeYJUGQ
 P6n7VipEA/qBVpQI5zeCUhHYcahoNrRjSLzaFnE2Z8CrQYQ6Ry9gVEhduvu2OTru
 62cWlAWlTJfx/FcR1Y0F/ZznnNSKMiAHcEe3F6Beztplg2ooq+z6FclJYrkmnxMq
 SXSOsqTCdi1/oFx36NpvLkykrIS9I7N/iqCnKwbm6X+nyZZKyAwYZhWVqkbozPPu
 +u1Ppq8o0IuWwEA6/UAmxgAO3m/Gkw==
 =tn0h
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull misc fs updates from Jan Kara:
 "The new quotactl_fd() syscall (remake of quotactl_path() syscall that
  got introduced & disabled in 5.13 cycle), and couple of udf, reiserfs,
  isofs, and writeback fixes and cleanups"

* tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  writeback: fix obtain a reference to a freeing memcg css
  quota: remove unnecessary oom message
  isofs: remove redundant continue statement
  quota: Wire up quotactl_fd syscall
  quota: Change quotactl_path() systcall to an fd-based one
  reiserfs: Remove unneed check in reiserfs_write_full_page()
  udf: Fix NULL pointer dereference in udf_symlink function
  reiserfs: add check for invalid 1st journal block
2021-07-01 12:06:39 -07:00
Andy Shevchenko
f39650de68 kernel.h: split out panic and oops helpers
kernel.h is being used as a dump for all kinds of stuff for a long time.
Here is the attempt to start cleaning it up by splitting out panic and
oops helpers.

There are several purposes of doing this:
- dropping dependency in bug.h
- dropping a loop by moving out panic_notifier.h
- unload kernel.h from something which has its own domain

At the same time convert users tree-wide to use new headers, although for
the time being include new header back to kernel.h to avoid twisted
indirected includes for existing users.

[akpm@linux-foundation.org: thread_info.h needs limits.h]
[andriy.shevchenko@linux.intel.com: ia64 fix]
  Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com

Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:04 -07:00
Anshuman Khandual
1c2f7d14d8 mm/thp: define default pmd_pgtable()
Currently most platforms define pmd_pgtable() as pmd_page() duplicating
the same code all over.  Instead just define a default value i.e
pmd_page() for pmd_pgtable() and let platforms override when required via
<asm/pgtable.h>.  All the existing platform that override pmd_pgtable()
have been moved into their respective <asm/pgtable.h> header in order to
precede before the new generic definition.  This makes it much cleaner
with reduced code.

Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:03 -07:00
Anshuman Khandual
fac7757e1f mm: define default value for FIRST_USER_ADDRESS
Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the
same code all over.  Instead just define a generic default value (i.e 0UL)
for FIRST_USER_ADDRESS and let the platforms override when required.  This
makes it much cleaner with reduced code.

The default FIRST_USER_ADDRESS here would be skipped in <linux/pgtable.h>
when the given platform overrides its value via <asm/pgtable.h>.

Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Guo Ren <guoren@kernel.org>			[csky]
Acked-by: Stafford Horne <shorne@gmail.com>		[openrisc]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>	[RISC-V]
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:02 -07:00
David Hildenbrand
4ca9b3859d mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables
I. Background: Sparse Memory Mappings

When we manage sparse memory mappings dynamically in user space - also
sometimes involving MAP_NORESERVE - we want to dynamically populate/
discard memory inside such a sparse memory region.  Example users are
hypervisors (especially implementing memory ballooning or similar
technologies like virtio-mem) and memory allocators.  In addition, we want
to fail in a nice way (instead of generating SIGBUS) if populating does
not succeed because we are out of backend memory (which can happen easily
with file-based mappings, especially tmpfs and hugetlbfs).

While MADV_DONTNEED, MADV_REMOVE and FALLOC_FL_PUNCH_HOLE allow for
reliably discarding memory for most mapping types, there is no generic
approach to populate page tables and preallocate memory.

Although mmap() supports MAP_POPULATE, it is not applicable to the concept
of sparse memory mappings, where we want to populate/discard dynamically
and avoid expensive/problematic remappings.  In addition, we never
actually report errors during the final populate phase - it is best-effort
only.

fallocate() can be used to preallocate file-based memory and fail in a
safe way.  However, it cannot really be used for any private mappings on
anonymous files via memfd due to COW semantics.  In addition, fallocate()
does not actually populate page tables, so we still always get pagefaults
on first access - which is sometimes undesired (i.e., real-time workloads)
and requires real prefaulting of page tables, not just a preallocation of
backend storage.  There might be interesting use cases for sparse memory
regions along with mlockall(MCL_ONFAULT) which fallocate() cannot satisfy
as it does not prefault page tables.

II. On preallcoation/prefaulting from user space

Because we don't have a proper interface, what applications (like QEMU and
databases) end up doing is touching (i.e., reading+writing one byte to not
overwrite existing data) all individual pages.

However, that approach
1) Can result in wear on storage backing, because we end up reading/writing
   each page; this is especially a problem for dax/pmem.
2) Can result in mmap_sem contention when prefaulting via multiple
   threads.
3) Requires expensive signal handling, especially to catch SIGBUS in case
   of hugetlbfs/shmem/file-backed memory. For example, this is
   problematic in hypervisors like QEMU where SIGBUS handlers might already
   be used by other subsystems concurrently to e.g, handle hardware errors.
   "Simply" doing preallocation concurrently from other thread is not that
   easy.

III. On MADV_WILLNEED

Extending MADV_WILLNEED is not an option because
1. It would change the semantics: "Expect access in the near future." and
   "might be a good idea to read some pages" vs. "Definitely populate/
   preallocate all memory and definitely fail on errors.".
2. Existing users (like virtio-balloon in QEMU when deflating the balloon)
   don't want populate/prealloc semantics. They treat this rather as a hint
   to give a little performance boost without too much overhead - and don't
   expect that a lot of memory might get consumed or a lot of time
   might be spent.

IV. MADV_POPULATE_READ and MADV_POPULATE_WRITE

Let's introduce MADV_POPULATE_READ and MADV_POPULATE_WRITE, inspired by
MAP_POPULATE, with the following semantics:
1. MADV_POPULATE_READ can be used to prefault page tables just like
   manually reading each individual page. This will not break any COW
   mappings. The shared zero page might get mapped and no backend storage
   might get preallocated -- allocation might be deferred to
   write-fault time. Especially shared file mappings require an explicit
   fallocate() upfront to actually preallocate backend memory (blocks in
   the file system) in case the file might have holes.
2. If MADV_POPULATE_READ succeeds, all page tables have been populated
   (prefaulted) readable once.
3. MADV_POPULATE_WRITE can be used to preallocate backend memory and
   prefault page tables just like manually writing (or
   reading+writing) each individual page. This will break any COW
   mappings -- e.g., the shared zeropage is never populated.
4. If MADV_POPULATE_WRITE succeeds, all page tables have been populated
   (prefaulted) writable once.
5. MADV_POPULATE_READ and MADV_POPULATE_WRITE cannot be applied to special
   mappings marked with VM_PFNMAP and VM_IO. Also, proper access
   permissions (e.g., PROT_READ, PROT_WRITE) are required. If any such
   mapping is encountered, madvise() fails with -EINVAL.
6. If MADV_POPULATE_READ or MADV_POPULATE_WRITE fails, some page tables
   might have been populated.
7. MADV_POPULATE_READ and MADV_POPULATE_WRITE will return -EHWPOISON
   when encountering a HW poisoned page in the range.
8. Similar to MAP_POPULATE, MADV_POPULATE_READ and MADV_POPULATE_WRITE
   cannot protect from the OOM (Out Of Memory) handler killing the
   process.

While the use case for MADV_POPULATE_WRITE is fairly obvious (i.e.,
preallocate memory and prefault page tables for VMs), one issue is that
whenever we prefault pages writable, the pages have to be marked dirty,
because the CPU could dirty them any time.  while not a real problem for
hugetlbfs or dax/pmem, it can be a problem for shared file mappings: each
page will be marked dirty and has to be written back later when evicting.

MADV_POPULATE_READ allows for optimizing this scenario: Pre-read a whole
mapping from backend storage without marking it dirty, such that eviction
won't have to write it back.  As discussed above, shared file mappings
might require an explciit fallocate() upfront to achieve
preallcoation+prepopulation.

Although sparse memory mappings are the primary use case, this will also
be useful for other preallocate/prefault use cases where MAP_POPULATE is
not desired or the semantics of MAP_POPULATE are not sufficient: as one
example, QEMU users can trigger preallocation/prefaulting of guest RAM
after the mapping was created -- and don't want errors to be silently
suppressed.

Looking at the history, MADV_POPULATE was already proposed in 2013 [1],
however, the main motivation back than was performance improvements --
which should also still be the case.

V. Single-threaded performance comparison

I did a short experiment, prefaulting page tables on completely *empty
mappings/files* and repeated the experiment 10 times.  The results
correspond to the shortest execution time.  In general, the performance
benefit for huge pages is negligible with small mappings.

V.1: Private mappings

POPULATE_READ and POPULATE_WRITE is fastest.  Note that
Reading/POPULATE_READ will populate the shared zeropage where applicable
-- which result in short population times.

The fastest way to allocate backend storage (here: swap or huge pages) and
prefault page tables is POPULATE_WRITE.

V.2: Shared mappings

fallocate() is fastest, however, doesn't prefault page tables.
POPULATE_WRITE is faster than simple writes and read/writes.
POPULATE_READ is faster than simple reads.

Without a fd, the fastest way to allocate backend storage and prefault
page tables is POPULATE_WRITE.  With an fd, the fastest way is usually
FALLOCATE+POPULATE_READ or FALLOCATE+POPULATE_WRITE respectively; one
exception are actual files: FALLOCATE+Read is slightly faster than
FALLOCATE+POPULATE_READ.

The fastest way to allocate backend storage prefault page tables is
FALLOCATE+POPULATE_WRITE -- except when dealing with actual files; then,
FALLOCATE+POPULATE_READ is fastest and won't directly mark all pages as
dirty.

v.3: Detailed results

==================================================
2 MiB MAP_PRIVATE:
**************************************************
Anon 4 KiB     : Read                     :     0.119 ms
Anon 4 KiB     : Write                    :     0.222 ms
Anon 4 KiB     : Read/Write               :     0.380 ms
Anon 4 KiB     : POPULATE_READ            :     0.060 ms
Anon 4 KiB     : POPULATE_WRITE           :     0.158 ms
Memfd 4 KiB    : Read                     :     0.034 ms
Memfd 4 KiB    : Write                    :     0.310 ms
Memfd 4 KiB    : Read/Write               :     0.362 ms
Memfd 4 KiB    : POPULATE_READ            :     0.039 ms
Memfd 4 KiB    : POPULATE_WRITE           :     0.229 ms
Memfd 2 MiB    : Read                     :     0.030 ms
Memfd 2 MiB    : Write                    :     0.030 ms
Memfd 2 MiB    : Read/Write               :     0.030 ms
Memfd 2 MiB    : POPULATE_READ            :     0.030 ms
Memfd 2 MiB    : POPULATE_WRITE           :     0.030 ms
tmpfs          : Read                     :     0.033 ms
tmpfs          : Write                    :     0.313 ms
tmpfs          : Read/Write               :     0.406 ms
tmpfs          : POPULATE_READ            :     0.039 ms
tmpfs          : POPULATE_WRITE           :     0.285 ms
file           : Read                     :     0.033 ms
file           : Write                    :     0.351 ms
file           : Read/Write               :     0.408 ms
file           : POPULATE_READ            :     0.039 ms
file           : POPULATE_WRITE           :     0.290 ms
hugetlbfs      : Read                     :     0.030 ms
hugetlbfs      : Write                    :     0.030 ms
hugetlbfs      : Read/Write               :     0.030 ms
hugetlbfs      : POPULATE_READ            :     0.030 ms
hugetlbfs      : POPULATE_WRITE           :     0.030 ms
**************************************************
4096 MiB MAP_PRIVATE:
**************************************************
Anon 4 KiB     : Read                     :   237.940 ms
Anon 4 KiB     : Write                    :   708.409 ms
Anon 4 KiB     : Read/Write               :  1054.041 ms
Anon 4 KiB     : POPULATE_READ            :   124.310 ms
Anon 4 KiB     : POPULATE_WRITE           :   572.582 ms
Memfd 4 KiB    : Read                     :   136.928 ms
Memfd 4 KiB    : Write                    :   963.898 ms
Memfd 4 KiB    : Read/Write               :  1106.561 ms
Memfd 4 KiB    : POPULATE_READ            :    78.450 ms
Memfd 4 KiB    : POPULATE_WRITE           :   805.881 ms
Memfd 2 MiB    : Read                     :   357.116 ms
Memfd 2 MiB    : Write                    :   357.210 ms
Memfd 2 MiB    : Read/Write               :   357.606 ms
Memfd 2 MiB    : POPULATE_READ            :   356.094 ms
Memfd 2 MiB    : POPULATE_WRITE           :   356.937 ms
tmpfs          : Read                     :   137.536 ms
tmpfs          : Write                    :   954.362 ms
tmpfs          : Read/Write               :  1105.954 ms
tmpfs          : POPULATE_READ            :    80.289 ms
tmpfs          : POPULATE_WRITE           :   822.826 ms
file           : Read                     :   137.874 ms
file           : Write                    :   987.025 ms
file           : Read/Write               :  1107.439 ms
file           : POPULATE_READ            :    80.413 ms
file           : POPULATE_WRITE           :   857.622 ms
hugetlbfs      : Read                     :   355.607 ms
hugetlbfs      : Write                    :   355.729 ms
hugetlbfs      : Read/Write               :   356.127 ms
hugetlbfs      : POPULATE_READ            :   354.585 ms
hugetlbfs      : POPULATE_WRITE           :   355.138 ms
**************************************************
2 MiB MAP_SHARED:
**************************************************
Anon 4 KiB     : Read                     :     0.394 ms
Anon 4 KiB     : Write                    :     0.348 ms
Anon 4 KiB     : Read/Write               :     0.400 ms
Anon 4 KiB     : POPULATE_READ            :     0.326 ms
Anon 4 KiB     : POPULATE_WRITE           :     0.273 ms
Anon 2 MiB     : Read                     :     0.030 ms
Anon 2 MiB     : Write                    :     0.030 ms
Anon 2 MiB     : Read/Write               :     0.030 ms
Anon 2 MiB     : POPULATE_READ            :     0.030 ms
Anon 2 MiB     : POPULATE_WRITE           :     0.030 ms
Memfd 4 KiB    : Read                     :     0.412 ms
Memfd 4 KiB    : Write                    :     0.372 ms
Memfd 4 KiB    : Read/Write               :     0.419 ms
Memfd 4 KiB    : POPULATE_READ            :     0.343 ms
Memfd 4 KiB    : POPULATE_WRITE           :     0.288 ms
Memfd 4 KiB    : FALLOCATE                :     0.137 ms
Memfd 4 KiB    : FALLOCATE+Read           :     0.446 ms
Memfd 4 KiB    : FALLOCATE+Write          :     0.330 ms
Memfd 4 KiB    : FALLOCATE+Read/Write     :     0.454 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_READ  :     0.379 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_WRITE :     0.268 ms
Memfd 2 MiB    : Read                     :     0.030 ms
Memfd 2 MiB    : Write                    :     0.030 ms
Memfd 2 MiB    : Read/Write               :     0.030 ms
Memfd 2 MiB    : POPULATE_READ            :     0.030 ms
Memfd 2 MiB    : POPULATE_WRITE           :     0.030 ms
Memfd 2 MiB    : FALLOCATE                :     0.030 ms
Memfd 2 MiB    : FALLOCATE+Read           :     0.031 ms
Memfd 2 MiB    : FALLOCATE+Write          :     0.031 ms
Memfd 2 MiB    : FALLOCATE+Read/Write     :     0.031 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_READ  :     0.030 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_WRITE :     0.030 ms
tmpfs          : Read                     :     0.416 ms
tmpfs          : Write                    :     0.369 ms
tmpfs          : Read/Write               :     0.425 ms
tmpfs          : POPULATE_READ            :     0.346 ms
tmpfs          : POPULATE_WRITE           :     0.295 ms
tmpfs          : FALLOCATE                :     0.139 ms
tmpfs          : FALLOCATE+Read           :     0.447 ms
tmpfs          : FALLOCATE+Write          :     0.333 ms
tmpfs          : FALLOCATE+Read/Write     :     0.454 ms
tmpfs          : FALLOCATE+POPULATE_READ  :     0.380 ms
tmpfs          : FALLOCATE+POPULATE_WRITE :     0.272 ms
file           : Read                     :     0.191 ms
file           : Write                    :     0.511 ms
file           : Read/Write               :     0.524 ms
file           : POPULATE_READ            :     0.196 ms
file           : POPULATE_WRITE           :     0.434 ms
file           : FALLOCATE                :     0.004 ms
file           : FALLOCATE+Read           :     0.197 ms
file           : FALLOCATE+Write          :     0.554 ms
file           : FALLOCATE+Read/Write     :     0.480 ms
file           : FALLOCATE+POPULATE_READ  :     0.201 ms
file           : FALLOCATE+POPULATE_WRITE :     0.381 ms
hugetlbfs      : Read                     :     0.030 ms
hugetlbfs      : Write                    :     0.030 ms
hugetlbfs      : Read/Write               :     0.030 ms
hugetlbfs      : POPULATE_READ            :     0.030 ms
hugetlbfs      : POPULATE_WRITE           :     0.030 ms
hugetlbfs      : FALLOCATE                :     0.030 ms
hugetlbfs      : FALLOCATE+Read           :     0.031 ms
hugetlbfs      : FALLOCATE+Write          :     0.031 ms
hugetlbfs      : FALLOCATE+Read/Write     :     0.030 ms
hugetlbfs      : FALLOCATE+POPULATE_READ  :     0.030 ms
hugetlbfs      : FALLOCATE+POPULATE_WRITE :     0.030 ms
**************************************************
4096 MiB MAP_SHARED:
**************************************************
Anon 4 KiB     : Read                     :  1053.090 ms
Anon 4 KiB     : Write                    :   913.642 ms
Anon 4 KiB     : Read/Write               :  1060.350 ms
Anon 4 KiB     : POPULATE_READ            :   893.691 ms
Anon 4 KiB     : POPULATE_WRITE           :   782.885 ms
Anon 2 MiB     : Read                     :   358.553 ms
Anon 2 MiB     : Write                    :   358.419 ms
Anon 2 MiB     : Read/Write               :   357.992 ms
Anon 2 MiB     : POPULATE_READ            :   357.533 ms
Anon 2 MiB     : POPULATE_WRITE           :   357.808 ms
Memfd 4 KiB    : Read                     :  1078.144 ms
Memfd 4 KiB    : Write                    :   942.036 ms
Memfd 4 KiB    : Read/Write               :  1100.391 ms
Memfd 4 KiB    : POPULATE_READ            :   925.829 ms
Memfd 4 KiB    : POPULATE_WRITE           :   804.394 ms
Memfd 4 KiB    : FALLOCATE                :   304.632 ms
Memfd 4 KiB    : FALLOCATE+Read           :  1163.359 ms
Memfd 4 KiB    : FALLOCATE+Write          :   933.186 ms
Memfd 4 KiB    : FALLOCATE+Read/Write     :  1187.304 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_READ  :  1013.660 ms
Memfd 4 KiB    : FALLOCATE+POPULATE_WRITE :   794.560 ms
Memfd 2 MiB    : Read                     :   358.131 ms
Memfd 2 MiB    : Write                    :   358.099 ms
Memfd 2 MiB    : Read/Write               :   358.250 ms
Memfd 2 MiB    : POPULATE_READ            :   357.563 ms
Memfd 2 MiB    : POPULATE_WRITE           :   357.334 ms
Memfd 2 MiB    : FALLOCATE                :   356.735 ms
Memfd 2 MiB    : FALLOCATE+Read           :   358.152 ms
Memfd 2 MiB    : FALLOCATE+Write          :   358.331 ms
Memfd 2 MiB    : FALLOCATE+Read/Write     :   358.018 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_READ  :   357.286 ms
Memfd 2 MiB    : FALLOCATE+POPULATE_WRITE :   357.523 ms
tmpfs          : Read                     :  1087.265 ms
tmpfs          : Write                    :   950.840 ms
tmpfs          : Read/Write               :  1107.567 ms
tmpfs          : POPULATE_READ            :   922.605 ms
tmpfs          : POPULATE_WRITE           :   810.094 ms
tmpfs          : FALLOCATE                :   306.320 ms
tmpfs          : FALLOCATE+Read           :  1169.796 ms
tmpfs          : FALLOCATE+Write          :   933.730 ms
tmpfs          : FALLOCATE+Read/Write     :  1191.610 ms
tmpfs          : FALLOCATE+POPULATE_READ  :  1020.474 ms
tmpfs          : FALLOCATE+POPULATE_WRITE :   798.945 ms
file           : Read                     :   654.101 ms
file           : Write                    :  1259.142 ms
file           : Read/Write               :  1289.509 ms
file           : POPULATE_READ            :   661.642 ms
file           : POPULATE_WRITE           :  1106.816 ms
file           : FALLOCATE                :     1.864 ms
file           : FALLOCATE+Read           :   656.328 ms
file           : FALLOCATE+Write          :  1153.300 ms
file           : FALLOCATE+Read/Write     :  1180.613 ms
file           : FALLOCATE+POPULATE_READ  :   668.347 ms
file           : FALLOCATE+POPULATE_WRITE :   996.143 ms
hugetlbfs      : Read                     :   357.245 ms
hugetlbfs      : Write                    :   357.413 ms
hugetlbfs      : Read/Write               :   357.120 ms
hugetlbfs      : POPULATE_READ            :   356.321 ms
hugetlbfs      : POPULATE_WRITE           :   356.693 ms
hugetlbfs      : FALLOCATE                :   355.927 ms
hugetlbfs      : FALLOCATE+Read           :   357.074 ms
hugetlbfs      : FALLOCATE+Write          :   357.120 ms
hugetlbfs      : FALLOCATE+Read/Write     :   356.983 ms
hugetlbfs      : FALLOCATE+POPULATE_READ  :   356.413 ms
hugetlbfs      : FALLOCATE+POPULATE_WRITE :   356.266 ms
**************************************************

[1] https://lkml.org/lkml/2013/6/27/698

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20210419135443.12822-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:30 -07:00
Kefeng Wang
63703f37aa mm: generalize ZONE_[DMA|DMA32]
ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that
subscribe to them.  Instead, just make them generic options which can be
selected on applicable platforms.

Also only x86/arm64 architectures could enable both ZONE_DMA and
ZONE_DMA32 if EXPERT, add ARCH_HAS_ZONE_DMA_SET to make dma zone
configurable and visible on the two architectures.

Link: https://lkml.kernel.org/r/20210528074557.17768-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>	[RISC-V]
Acked-by: Michal Simek <michal.simek@xilinx.com>	[microblaze]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:30 -07:00
Kefeng Wang
781eb2cdd2 mm/kconfig: move HOLES_IN_ZONE into mm
commit a55749639dc1 ("ia64: drop marked broken DISCONTIGMEM and
VIRTUAL_MEM_MAP") drop VIRTUAL_MEM_MAP, so there is no need HOLES_IN_ZONE
on ia64.

Also move HOLES_IN_ZONE into mm/Kconfig, select it if architecture needs
this feature.

Link: https://lkml.kernel.org/r/20210417075946.181402-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:28 -07:00
Linus Torvalds
dbe69e4337 Networking changes for 5.14.
Core:
 
  - BPF:
    - add syscall program type and libbpf support for generating
      instructions and bindings for in-kernel BPF loaders (BPF loaders
      for BPF), this is a stepping stone for signed BPF programs
    - infrastructure to migrate TCP child sockets from one listener
      to another in the same reuseport group/map to improve flexibility
      of service hand-off/restart
    - add broadcast support to XDP redirect
 
  - allow bypass of the lockless qdisc to improving performance
    (for pktgen: +23% with one thread, +44% with 2 threads)
 
  - add a simpler version of "DO_ONCE()" which does not require
    jump labels, intended for slow-path usage
 
  - virtio/vsock: introduce SOCK_SEQPACKET support
 
  - add getsocketopt to retrieve netns cookie
 
  - ip: treat lowest address of a IPv4 subnet as ordinary unicast address
        allowing reclaiming of precious IPv4 addresses
 
  - ipv6: use prandom_u32() for ID generation
 
  - ip: add support for more flexible field selection for hashing
        across multi-path routes (w/ offload to mlxsw)
 
  - icmp: add support for extended RFC 8335 PROBE (ping)
 
  - seg6: add support for SRv6 End.DT46 behavior
 
  - mptcp:
     - DSS checksum support (RFC 8684) to detect middlebox meddling
     - support Connection-time 'C' flag
     - time stamping support
 
  - sctp: packetization Layer Path MTU Discovery (RFC 8899)
 
  - xfrm: speed up state addition with seq set
 
  - WiFi:
     - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
     - aggregation handling improvements for some drivers
     - minstrel improvements for no-ack frames
     - deferred rate control for TXQs to improve reaction times
     - switch from round robin to virtual time-based airtime scheduler
 
  - add trace points:
     - tcp checksum errors
     - openvswitch - action execution, upcalls
     - socket errors via sk_error_report
 
 Device APIs:
 
  - devlink: add rate API for hierarchical control of max egress rate
             of virtual devices (VFs, SFs etc.)
 
  - don't require RCU read lock to be held around BPF hooks
    in NAPI context
 
  - page_pool: generic buffer recycling
 
 New hardware/drivers:
 
  - mobile:
     - iosm: PCIe Driver for Intel M.2 Modem
     - support for Qualcomm MSM8998 (ipa)
 
  - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
 
  - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
 
  - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
 
  - NXP SJA1110 Automotive Ethernet 10-port switch
 
  - Qualcomm QCA8327 switch support (qca8k)
 
  - Mikrotik 10/25G NIC (atl1c)
 
 Driver changes:
 
  - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP
    (our first foray into MAC/PHY description via ACPI)
 
  - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
 
  - Mellanox/Nvidia NIC (mlx5)
    - NIC VF offload of L2 bridging
    - support IRQ distribution to Sub-functions
 
  - Marvell (prestera):
     - add flower and match all
     - devlink trap
     - link aggregation
 
  - Netronome (nfp): connection tracking offload
 
  - Intel 1GE (igc): add AF_XDP support
 
  - Marvell DPU (octeontx2): ingress ratelimit offload
 
  - Google vNIC (gve): new ring/descriptor format support
 
  - Qualcomm mobile (rmnet & ipa): inline checksum offload support
 
  - MediaTek WiFi (mt76)
     - mt7915 MSI support
     - mt7915 Tx status reporting
     - mt7915 thermal sensors support
     - mt7921 decapsulation offload
     - mt7921 enable runtime pm and deep sleep
 
  - Realtek WiFi (rtw88)
     - beacon filter support
     - Tx antenna path diversity support
     - firmware crash information via devcoredump
 
  - Qualcomm 60GHz WiFi (wcn36xx)
     - Wake-on-WLAN support with magic packets and GTK rekeying
 
  - Micrel PHY (ksz886x/ksz8081): add cable test support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmDb+fUACgkQMUZtbf5S
 Irs2Jg//aqN0Q8CgIvYCVhPxQw1tY7pTAbgyqgBZ01vwjyvtIOgJiWzSfFEU84mX
 M8fcpFX5eTKrOyJ9S6UFfQ/JG114n3hjAxFFT4Hxk2gC1Tg0vHuFQTDHcUl28bUE
 mTm61e1YpdorILnv2k5JVQ/wu0vs5QKDrjcYcrcPnh+j93wvnPOgAfDBV95nZzjS
 OTt4q2fR8GzLcSYWWsclMbDNkzyTG50RW/0Yd6aGjr5QGvXfrMeXfUJNz533PMf/
 w5lNyjRKv+x9mdTZJzU0+msNUrZgUdRz7W8Ey8lD3hJZRE+D6/uU7FtsE8Mi3+uc
 HWxeZUyzA3YF1MfVl/eesbxyPT7S/OkLzk4O5B35FbqP0YltaP+bOjq1/nM3ce1/
 io9Dx9pIl/2JANUgRCAtLi8Z2dkvRoqTaBxZ/nPudCCljFwDwl6joTMJ7Ow22i5Y
 5aIkcXFmZq4LbJDiHvbTlqT7yiuaEvu2UK/23bSIg/K3nF4eAmkY9Y1EgiMf60OF
 78Ttw0wk2tUegwaS5MZnCniKBKDyl9gM2F6rbZ/IxQRR2LTXFc1B6gC+ynUxgXfh
 Ub8O++6qGYGYZ0XvQH4pzco79p3qQWBTK5beIp2eu6BOAjBVIXq4AibUfoQLACsu
 hX7jMPYd0kc3WFgUnKgQP8EnjFSwbf4XiaE7fIXvWBY8hzCw2h4=
 =LvtX
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - BPF:
      - add syscall program type and libbpf support for generating
        instructions and bindings for in-kernel BPF loaders (BPF loaders
        for BPF), this is a stepping stone for signed BPF programs
      - infrastructure to migrate TCP child sockets from one listener to
        another in the same reuseport group/map to improve flexibility
        of service hand-off/restart
      - add broadcast support to XDP redirect

   - allow bypass of the lockless qdisc to improving performance (for
     pktgen: +23% with one thread, +44% with 2 threads)

   - add a simpler version of "DO_ONCE()" which does not require jump
     labels, intended for slow-path usage

   - virtio/vsock: introduce SOCK_SEQPACKET support

   - add getsocketopt to retrieve netns cookie

   - ip: treat lowest address of a IPv4 subnet as ordinary unicast
     address allowing reclaiming of precious IPv4 addresses

   - ipv6: use prandom_u32() for ID generation

   - ip: add support for more flexible field selection for hashing
     across multi-path routes (w/ offload to mlxsw)

   - icmp: add support for extended RFC 8335 PROBE (ping)

   - seg6: add support for SRv6 End.DT46 behavior

   - mptcp:
      - DSS checksum support (RFC 8684) to detect middlebox meddling
      - support Connection-time 'C' flag
      - time stamping support

   - sctp: packetization Layer Path MTU Discovery (RFC 8899)

   - xfrm: speed up state addition with seq set

   - WiFi:
      - hidden AP discovery on 6 GHz and other HE 6 GHz improvements
      - aggregation handling improvements for some drivers
      - minstrel improvements for no-ack frames
      - deferred rate control for TXQs to improve reaction times
      - switch from round robin to virtual time-based airtime scheduler

   - add trace points:
      - tcp checksum errors
      - openvswitch - action execution, upcalls
      - socket errors via sk_error_report

  Device APIs:

   - devlink: add rate API for hierarchical control of max egress rate
     of virtual devices (VFs, SFs etc.)

   - don't require RCU read lock to be held around BPF hooks in NAPI
     context

   - page_pool: generic buffer recycling

  New hardware/drivers:

   - mobile:
      - iosm: PCIe Driver for Intel M.2 Modem
      - support for Qualcomm MSM8998 (ipa)

   - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices

   - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches

   - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)

   - NXP SJA1110 Automotive Ethernet 10-port switch

   - Qualcomm QCA8327 switch support (qca8k)

   - Mikrotik 10/25G NIC (atl1c)

  Driver changes:

   - ACPI support for some MDIO, MAC and PHY devices from Marvell and
     NXP (our first foray into MAC/PHY description via ACPI)

   - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx

   - Mellanox/Nvidia NIC (mlx5)
      - NIC VF offload of L2 bridging
      - support IRQ distribution to Sub-functions

   - Marvell (prestera):
      - add flower and match all
      - devlink trap
      - link aggregation

   - Netronome (nfp): connection tracking offload

   - Intel 1GE (igc): add AF_XDP support

   - Marvell DPU (octeontx2): ingress ratelimit offload

   - Google vNIC (gve): new ring/descriptor format support

   - Qualcomm mobile (rmnet & ipa): inline checksum offload support

   - MediaTek WiFi (mt76)
      - mt7915 MSI support
      - mt7915 Tx status reporting
      - mt7915 thermal sensors support
      - mt7921 decapsulation offload
      - mt7921 enable runtime pm and deep sleep

   - Realtek WiFi (rtw88)
      - beacon filter support
      - Tx antenna path diversity support
      - firmware crash information via devcoredump

   - Qualcomm WiFi (wcn36xx)
      - Wake-on-WLAN support with magic packets and GTK rekeying

   - Micrel PHY (ksz886x/ksz8081): add cable test support"

* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
  tcp: change ICSK_CA_PRIV_SIZE definition
  tcp_yeah: check struct yeah size at compile time
  gve: DQO: Fix off by one in gve_rx_dqo()
  stmmac: intel: set PCI_D3hot in suspend
  stmmac: intel: Enable PHY WOL option in EHL
  net: stmmac: option to enable PHY WOL with PMT enabled
  net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
  net: use netdev_info in ndo_dflt_fdb_{add,del}
  ptp: Set lookup cookie when creating a PTP PPS source.
  net: sock: add trace for socket errors
  net: sock: introduce sk_error_report
  net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
  net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
  net: dsa: include fdb entries pointing to bridge in the host fdb list
  net: dsa: include bridge addresses which are local in the host fdb list
  net: dsa: sync static FDB entries on foreign interfaces to hardware
  net: dsa: install the host MDB and FDB entries in the master's RX filter
  net: dsa: reference count the FDB addresses at the cross-chip notifier level
  net: dsa: introduce a separate cross-chip notifier type for host FDBs
  net: dsa: reference count the MDB entries at the cross-chip notifier level
  ...
2021-06-30 15:51:09 -07:00
Wei Li
cf02ce742f MIPS: Fix PKMAP with 32-bit MIPS huge page support
When 32-bit MIPS huge page support is enabled, we halve the number of
pointers a PTE page holds, making its last half go to waste.
Correspondingly, we should halve the number of kmap entries, as we just
initialized only a single pte table for that in pagetable_init().

Fixes: 35476311e5 ("MIPS: Add partial 32-bit huge page support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-30 14:41:32 +02:00
周琰杰 (Zhou Yanjie)
34c522a07c MIPS: CI20: Add second percpu timer for SMP.
1.Add a new TCU channel as the percpu timer of core1, this is to
  prepare for the subsequent SMP support. The newly added channel
  will not adversely affect the current single-core state.
2.Adjust the position of TCU node to make it consistent with the
  order in jz4780.dtsi file.

Tested-by: Nikolaus Schaller <hns@goldelico.com> # on CI20
Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-30 14:37:16 +02:00
周琰杰 (Zhou Yanjie)
23c64447b3 MIPS: CI20: Reduce clocksource to 750 kHz.
The original clock (3 MHz) is too fast for the clocksource,
there will be a chance that the system may get stuck.

Reported-by: Nikolaus Schaller <hns@goldelico.com>
Tested-by: Nikolaus Schaller <hns@goldelico.com> # on CI20
Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-30 14:37:16 +02:00
周琰杰 (Zhou Yanjie)
ab3040e137 MIPS: Ingenic: Add MAC syscon nodes for Ingenic SoCs.
Add MAC syscon nodes for X1000 SoC and X1830 SoC from Ingenic.

Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-30 14:37:16 +02:00
周琰杰 (Zhou Yanjie)
579f73cf84 MIPS: X1830: Respect cell count of common properties.
If N fields of X cells should be provided, then that's what the
devicetree should represent, instead of having one single field of
(N * X) cells.

Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-30 14:37:16 +02:00
Linus Torvalds
65090f30ab Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "191 patches.

  Subsystems affected by this patch series: kthread, ia64, scripts,
  ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab,
  slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap,
  mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization,
  pagealloc, and memory-failure)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits)
  mm,hwpoison: make get_hwpoison_page() call get_any_page()
  mm,hwpoison: send SIGBUS with error virutal address
  mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes
  mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
  mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
  mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
  docs: remove description of DISCONTIGMEM
  arch, mm: remove stale mentions of DISCONIGMEM
  mm: remove CONFIG_DISCONTIGMEM
  m68k: remove support for DISCONTIGMEM
  arc: remove support for DISCONTIGMEM
  arc: update comment about HIGHMEM implementation
  alpha: remove DISCONTIGMEM and NUMA
  mm/page_alloc: move free_the_page
  mm/page_alloc: fix counting of managed_pages
  mm/page_alloc: improve memmap_pages dbg msg
  mm: drop SECTION_SHIFT in code comments
  mm/page_alloc: introduce vm.percpu_pagelist_high_fraction
  mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
  mm/page_alloc: scale the number of pages that are batch freed
  ...
2021-06-29 17:29:11 -07:00
Linus Torvalds
21edf50948 Updates for the interrupt subsystem:
Core changes:
 
   - Cleanup and simplification of common code to invoke the low level
     interrupt flow handlers when this invocation requires irqdomain
     resolution. Add the necessary core infrastructure.
 
   - Provide a proper interface for modular PMU drivers to set the
     interrupt affinity.
 
   - Add a request flag which allows to exclude interrupts from spurious
     interrupt detection. Useful especially for IPI handlers which always
     return IRQ_HANDLED which turns the spurious interrupt detection into a
     pointless waste of CPU cycles.
 
 Driver changes:
 
   - Bulk convert interrupt chip drivers to the new irqdomain low level flow
     handler invocation mechanism.
 
   - Add device tree bindings for the Renesas R-Car M3-W+ SoC
 
   - Enable modular build of the Qualcomm PDC driver
 
   - The usual small fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmDbIg8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobZNEAC2wTq3Ishk026va7g5mbQVSvAQyf8G
 0msmgJ48lJWVL9a6JUogNcCO7sZCTcAy4CYbuHI6kz1fGZZnNWSCrtEz0rFNAdWE
 WVR2k8ExR2R73vJm+K50WUMMj8YsefRnIFXWlJdTp+pksr3TZ7Lo70taGUK/6tMo
 aL0dqvnf7Vb3LG0iIkaHWLF4HnyK/UGqB+121rlL4UhI1/g+3EUxNWNcY5eg/dmc
 Ym73U1uDsjydp3/3jm8v8NYNtwCDGpujZZc/88RFLjP6PMpF1S9JUvDEt+LHJi0a
 cdS3RreB78HYXpLg5NtDFJwIegRMLSitvCGPBjHvWBzbifkMsA2zWIb6Cs8VkYys
 vuPoEGZ0ol+SWvcnSh5Xy36nyr4iGIBhQql47UAaqelSxsYPjvCCSD4yJV3k8hnC
 ZuDscOekXUMn75qZR0quNdi1SkgKpGZxK73QFbuW3Apl5EgArVai6kq0rbl6zlx6
 ACy0SEcevhOcpU6WpqDgrmUBgFr+M8zina8edRELgiFEuWT6pYxKwrN3pT4U5djO
 e5V3YuNzzwzvtUoXN4AiTlT8gwRiGfgeiEvHpvZBXPNvk5ffS6XzPiV81ZMWiBkb
 ReoCbqME3PKoxj1VAHJvVXHbcjiPIJeCRdV+5vQSNh1SPSQOmEdWyJtNUDrSkoym
 QkKKY5jrOhPhlQ==
 =FIKh
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for the interrupt subsystem:

  Core changes:

   - Cleanup and simplification of common code to invoke the low level
     interrupt flow handlers when this invocation requires irqdomain
     resolution. Add the necessary core infrastructure.

   - Provide a proper interface for modular PMU drivers to set the
     interrupt affinity.

   - Add a request flag which allows to exclude interrupts from spurious
     interrupt detection. Useful especially for IPI handlers which
     always return IRQ_HANDLED which turns the spurious interrupt
     detection into a pointless waste of CPU cycles.

  Driver changes:

   - Bulk convert interrupt chip drivers to the new irqdomain low level
     flow handler invocation mechanism.

   - Add device tree bindings for the Renesas R-Car M3-W+ SoC

   - Enable modular build of the Qualcomm PDC driver

   - The usual small fixes and improvements"

* tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties
  irqchip: gic-pm: Remove redundant error log of clock bulk
  irqchip/sun4i: Remove unnecessary oom message
  irqchip/irq-imx-gpcv2: Remove unnecessary oom message
  irqchip/imgpdc: Remove unnecessary oom message
  irqchip/gic-v3-its: Remove unnecessary oom message
  irqchip/gic-v2m: Remove unnecessary oom message
  irqchip/exynos-combiner: Remove unnecessary oom message
  irqchip: Bulk conversion to generic_handle_domain_irq()
  genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ()
  genirq: Add generic_handle_domain_irq() helper
  irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq()
  irqdesc: Fix __handle_domain_irq() comment
  genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co
  irqdomain: Introduce irq_resolve_mapping()
  irqdomain: Protect the linear revmap with RCU
  irqdomain: Cache irq_data instead of a virq number in the revmap
  irqdomain: Use struct_size() helper when allocating irqdomain
  irqdomain: Make normal and nomap irqdomains exclusive
  powerpc: Move the use of irq_domain_add_nomap() behind a config option
  ...
2021-06-29 12:25:04 -07:00
Mike Rapoport
a9ee6cf5c6 mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
After removal of DISCINTIGMEM the NEED_MULTIPLE_NODES and NUMA
configuration options are equivalent.

Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead.

Done with

	$ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \
		$(git grep -wl CONFIG_NEED_MULTIPLE_NODES)
	$ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \
		$(git grep -wl NEED_MULTIPLE_NODES)

with manual tweaks afterwards.

[rppt@linux.ibm.com: fix arm boot crash]
  Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com

Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:55 -07:00
Mike Rapoport
d3c251ab95 arch, mm: remove stale mentions of DISCONIGMEM
There are several places that mention DISCONIGMEM in comments or have
stale code guarded by CONFIG_DISCONTIGMEM.

Remove the dead code and update the comments.

Link: https://lkml.kernel.org/r/20210608091316.3622-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:55 -07:00
Liam Howlett
7f7020ac0d arch/mips/kernel/traps: use vma_lookup() instead of find_vma()
Use vma_lookup() to find the VMA at a specific address.  As vma_lookup()
will return NULL if the address is not within any VMA, the start address
no longer needs to be validated.

Link: https://lkml.kernel.org/r/20210521174745.2219620-8-Liam.Howlett@Oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29 10:53:51 -07:00
Nick Desaulniers
c994a3ec7e MIPS: set mips32r5 for virt extensions
Clang's integrated assembler only accepts these instructions when the
cpu is set to mips32r5. With this change, we can assemble
malta_defconfig with Clang via `make LLVM_IAS=1`.

Link: https://github.com/ClangBuiltLinux/linux/issues/763
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-29 10:23:57 +02:00
zhanglianjie
6817c94443 MIPS: loongsoon64: Reserve memory below starting pfn to prevent Oops
The cause of the problem is as follows:
1. when cat /sys/devices/system/memory/memory0/valid_zones,
   test_pages_in_a_zone() will be called.
2. test_pages_in_a_zone() finds the zone according to stat_pfn = 0.
   The smallest pfn of the numa node in the mips architecture is 128,
   and the page corresponding to the previous 0~127 pfn is not
   initialized (page->flags is 0xFFFFFFFF)
3. The nid and zonenum obtained using page_zone(pfn_to_page(0)) are out
   of bounds in the corresponding array,
   &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)],
   access to the out-of-bounds zone member variables appear abnormal,
   resulting in Oops.
Therefore, it is necessary to keep the page between 0 and the minimum
pfn to prevent Oops from appearing.

Signed-off-by: zhanglianjie <zhanglianjie@uniontech.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-29 10:15:46 +02:00
Paul Cercueil
cad065ed8d MIPS: MT extensions are not available on MIPS32r1
MIPS MT extensions were added with the MIPS 34K processor, which was
based on the MIPS32r2 ISA.

This fixes a build error when building a generic kernel for a MIPS32r1
CPU.

Fixes: c434b9f80b ("MIPS: Kconfig: add MIPS_GENERIC_KERNEL symbol")
Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-29 10:14:58 +02:00
Linus Torvalds
36824f198c ARM:
- Add MTE support in guests, complete with tag save/restore interface
 
 - Reduce the impact of CMOs by moving them in the page-table code
 
 - Allow device block mappings at stage-2
 
 - Reduce the footprint of the vmemmap in protected mode
 
 - Support the vGIC on dumb systems such as the Apple M1
 
 - Add selftest infrastructure to support multiple configuration
   and apply that to PMU/non-PMU setups
 
 - Add selftests for the debug architecture
 
 - The usual crop of PMU fixes
 
 PPC:
 
 - Support for the H_RPT_INVALIDATE hypercall
 
 - Conversion of Book3S entry/exit to C
 
 - Bug fixes
 
 S390:
 
 - new HW facilities for guests
 
 - make inline assembly more robust with KASAN and co
 
 x86:
 
 - Allow userspace to handle emulation errors (unknown instructions)
 
 - Lazy allocation of the rmap (host physical -> guest physical address)
 
 - Support for virtualizing TSC scaling on VMX machines
 
 - Optimizations to avoid shattering huge pages at the beginning of live migration
 
 - Support for initializing the PDPTRs without loading them from memory
 
 - Many TLB flushing cleanups
 
 - Refuse to load if two-stage paging is available but NX is not (this has
   been a requirement in practice for over a year)
 
 - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
   CR0/CR4/EFER, using the MMU mode everywhere once it is computed
   from the CPU registers
 
 - Use PM notifier to notify the guest about host suspend or hibernate
 
 - Support for passing arguments to Hyper-V hypercalls using XMM registers
 
 - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on
   AMD processors
 
 - Hide Hyper-V hypercalls that are not included in the guest CPUID
 
 - Fixes for live migration of virtual machines that use the Hyper-V
   "enlightened VMCS" optimization of nested virtualization
 
 - Bugfixes (not many)
 
 Generic:
 
 - Support for retrieving statistics without debugfs
 
 - Cleanups for the KVM selftests API
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDV9UYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOIRgf/XX8fKLh24RnTOs2ldIu2AfRGVrT4
 QMrr8MxhmtukBAszk2xKvBt8/6gkUjdaIC3xqEnVjxaDaUvZaEtP7CQlF5JV45rn
 iv1zyxUKucXrnIOr+gCioIT7qBlh207zV35ArKioP9Y83cWx9uAs22pfr6g+7RxO
 h8bJZlJbSG6IGr3voANCIb9UyjU1V/l8iEHqRwhmr/A5rARPfD7g8lfMEQeGkzX6
 +/UydX2fumB3tl8e2iMQj6vLVdSOsCkehvpHK+Z33EpkKhan7GwZ2sZ05WmXV/nY
 QLAYfD10KegoNWl5Ay4GTp4hEAIYVrRJCLC+wnLdc0U8udbfCuTC31LK4w==
 =NcRh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This covers all architectures (except MIPS) so I don't expect any
  other feature pull requests this merge window.

  ARM:

   - Add MTE support in guests, complete with tag save/restore interface

   - Reduce the impact of CMOs by moving them in the page-table code

   - Allow device block mappings at stage-2

   - Reduce the footprint of the vmemmap in protected mode

   - Support the vGIC on dumb systems such as the Apple M1

   - Add selftest infrastructure to support multiple configuration and
     apply that to PMU/non-PMU setups

   - Add selftests for the debug architecture

   - The usual crop of PMU fixes

  PPC:

   - Support for the H_RPT_INVALIDATE hypercall

   - Conversion of Book3S entry/exit to C

   - Bug fixes

  S390:

   - new HW facilities for guests

   - make inline assembly more robust with KASAN and co

  x86:

   - Allow userspace to handle emulation errors (unknown instructions)

   - Lazy allocation of the rmap (host physical -> guest physical
     address)

   - Support for virtualizing TSC scaling on VMX machines

   - Optimizations to avoid shattering huge pages at the beginning of
     live migration

   - Support for initializing the PDPTRs without loading them from
     memory

   - Many TLB flushing cleanups

   - Refuse to load if two-stage paging is available but NX is not (this
     has been a requirement in practice for over a year)

   - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
     CR0/CR4/EFER, using the MMU mode everywhere once it is computed
     from the CPU registers

   - Use PM notifier to notify the guest about host suspend or hibernate

   - Support for passing arguments to Hyper-V hypercalls using XMM
     registers

   - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap
     on AMD processors

   - Hide Hyper-V hypercalls that are not included in the guest CPUID

   - Fixes for live migration of virtual machines that use the Hyper-V
     "enlightened VMCS" optimization of nested virtualization

   - Bugfixes (not many)

  Generic:

   - Support for retrieving statistics without debugfs

   - Cleanups for the KVM selftests API"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits)
  KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled
  kvm: x86: disable the narrow guest module parameter on unload
  selftests: kvm: Allows userspace to handle emulation errors.
  kvm: x86: Allow userspace to handle emulation errors
  KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on
  KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault
  KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault
  KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT
  KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic
  KVM: x86: Enhance comments for MMU roles and nested transition trickiness
  KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE
  KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU
  KVM: x86/mmu: Use MMU's role to determine PTTYPE
  KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers
  KVM: x86/mmu: Add a helper to calculate root from role_regs
  KVM: x86/mmu: Add helper to update paging metadata
  KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0
  KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls
  KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper
  KVM: x86/mmu: Get nested MMU's root level from the MMU's role
  ...
2021-06-28 15:40:51 -07:00
Linus Torvalds
54a728dc5e Scheduler udpates for this cycle:
- Changes to core scheduling facilities:
 
     - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
       coordinated scheduling across SMT siblings. This is a much
       requested feature for cloud computing platforms, to allow
       the flexible utilization of SMT siblings, without exposing
       untrusted domains to information leaks & side channels, plus
       to ensure more deterministic computing performance on SMT
       systems used by heterogenous workloads.
 
       There's new prctls to set core scheduling groups, which
       allows more flexible management of workloads that can share
       siblings.
 
     - Fix task->state access anti-patterns that may result in missed
       wakeups and rename it to ->__state in the process to catch new
       abuses.
 
  - Load-balancing changes:
 
      - Tweak newidle_balance for fair-sched, to improve
        'memcache'-like workloads.
 
      - "Age" (decay) average idle time, to better track & improve workloads
        such as 'tbench'.
 
      - Fix & improve energy-aware (EAS) balancing logic & metrics.
 
      - Fix & improve the uclamp metrics.
 
      - Fix task migration (taskset) corner case on !CONFIG_CPUSET.
 
      - Fix RT and deadline utilization tracking across policy changes
 
      - Introduce a "burstable" CFS controller via cgroups, which allows
        bursty CPU-bound workloads to borrow a bit against their future
        quota to improve overall latencies & batching. Can be tweaked
        via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
 
      - Rework assymetric topology/capacity detection & handling.
 
  - Scheduler statistics & tooling:
 
      - Disable delayacct by default, but add a sysctl to enable
        it at runtime if tooling needs it. Use static keys and
        other optimizations to make it more palatable.
 
      - Use sched_clock() in delayacct, instead of ktime_get_ns().
 
  - Misc cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZcPoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g3yw//WfhIqy7Psa9d/MBMjQDRGbTuO4+w22Dj
 vmWFU44Q4KJxQHWeIgUlrK+dzvYWvNmflUs2CUUOiDVzxFTHMIyBtL4qCBUbx4Ns
 vKAcB9wsWZge2o3WzZqpProRhdoRaSKw8egUr2q7rACVBkckY7eGP/OjWxXU8BdA
 b7D0LPWwuIBFfN4pFYeCDLn32Dqr9s6Chyj+ZecabdG7EE6Gu+f1diVcxy7JE/mc
 4WWL0D1RqdgpGrBEuMJIxPYekdrZiuy4jtEbztz5gbTBteN1cj3BLfqn0Pc/e6rO
 Vyuc5mXCAmzRVi18z6g6bsVl+IA/nrbErENB2OHOhOYtqiZxqGTd4GPWZszMyY17
 5AsEO5+5pcaBsy4gyp09qURggBu9zhJnMVmOI3rIHZkmkhwzc6uUJlyhDCTiFWOz
 3ZF3LjbZEyCKodMD8qMHbs3axIBpIfZqjzkvSKyFnvfXEGVytVse7NUuWtQ36u92
 GnURxVeYY1TDVXvE1Y8owNKMxknKQ6YRlypP7Dtbeo/qG6hShp0xmS7qDLDi0ybZ
 ZlK+bDECiVoDf3nvJo+8v5M82IJ3CBt4UYldeRJsa1YCK/FsbK8tp91fkEfnXVue
 +U6LPX0AmMpXacR5HaZfb3uBIKRw/QMdP/7RFtBPhpV6jqCrEmuqHnpPQiEVtxwO
 UmG7bt94Trk=
 =3VDr
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler udpates from Ingo Molnar:

 - Changes to core scheduling facilities:

    - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
      coordinated scheduling across SMT siblings. This is a much
      requested feature for cloud computing platforms, to allow the
      flexible utilization of SMT siblings, without exposing untrusted
      domains to information leaks & side channels, plus to ensure more
      deterministic computing performance on SMT systems used by
      heterogenous workloads.

      There are new prctls to set core scheduling groups, which allows
      more flexible management of workloads that can share siblings.

    - Fix task->state access anti-patterns that may result in missed
      wakeups and rename it to ->__state in the process to catch new
      abuses.

 - Load-balancing changes:

    - Tweak newidle_balance for fair-sched, to improve 'memcache'-like
      workloads.

    - "Age" (decay) average idle time, to better track & improve
      workloads such as 'tbench'.

    - Fix & improve energy-aware (EAS) balancing logic & metrics.

    - Fix & improve the uclamp metrics.

    - Fix task migration (taskset) corner case on !CONFIG_CPUSET.

    - Fix RT and deadline utilization tracking across policy changes

    - Introduce a "burstable" CFS controller via cgroups, which allows
      bursty CPU-bound workloads to borrow a bit against their future
      quota to improve overall latencies & batching. Can be tweaked via
      /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.

    - Rework assymetric topology/capacity detection & handling.

 - Scheduler statistics & tooling:

    - Disable delayacct by default, but add a sysctl to enable it at
      runtime if tooling needs it. Use static keys and other
      optimizations to make it more palatable.

    - Use sched_clock() in delayacct, instead of ktime_get_ns().

 - Misc cleanups and fixes.

* tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  sched/doc: Update the CPU capacity asymmetry bits
  sched/topology: Rework CPU capacity asymmetry detection
  sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag
  psi: Fix race between psi_trigger_create/destroy
  sched/fair: Introduce the burstable CFS controller
  sched/uclamp: Fix uclamp_tg_restrict()
  sched/rt: Fix Deadline utilization tracking during policy change
  sched/rt: Fix RT utilization tracking during policy change
  sched: Change task_struct::state
  sched,arch: Remove unused TASK_STATE offsets
  sched,timer: Use __set_current_state()
  sched: Add get_current_state()
  sched,perf,kvm: Fix preemption condition
  sched: Introduce task_is_running()
  sched: Unbreak wakeups
  sched/fair: Age the average idle time
  sched/cpufreq: Consider reduced CPU capacity in energy calculation
  sched/fair: Take thermal pressure into account while estimating energy
  thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
  sched/fair: Return early from update_tg_cfs_load() if delta == 0
  ...
2021-06-28 12:14:19 -07:00
Linus Torvalds
28a27cbd86 Perf events updates for this cycle:
- Platform PMU driver updates:
 
      - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers
      - Fix RDPMC support
      - Fix [extended-]PEBS-via-PT support
      - Fix Sapphire Rapids event constraints
      - Fix :ppp support on Sapphire Rapids
      - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU
      - Other heterogenous-PMU fixes
 
  - Kprobes:
 
      - Remove the unused and misguided kprobe::fault_handler callbacks.
      - Warn about kprobes taking a page fault.
      - Fix the 'nmissed' stat counter.
 
  - Misc cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZaxMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hPgw//f9SnGzFoP1uR5TBqM8j/QHulMewew/iD
 dM5lh2emdmqHWYPBeRxUHgag38K2Golr3Y+NxLA3R+RMx+OZQe8Mz/wYvPQcBvsV
 k1HHImU3GRMn4GM7GwxH3vPIottDUx3mNS2J6pzlw3kwRUVqrxUdj/0/pSY/4eJ7
 ZT4uq4yLV83Jd3qioU7o7e/u6MrdNIIcAXRpVDdE9Mm1+kWXSVN7/h3Vsiz4tj5E
 iS+UXEtSc1a2mnmekv63pYkJHHNUb6guD8jgI/wrm1KIFGjDRifM+3TV6R/kB96/
 TfD2LhCcTShfSp8KI191pgV7/NQbB/PmLdSYmff3rTBiii4cqXuCygJCHInZ09z0
 4fTSSqM6aHg7kfTQyOCp+DUQ+9vNVXWo8mxt9c6B8xA0GyCI3zhjQ4UIiSUWRpjs
 Be5ZyF0kNNuPxYrKFnGnBf8+51DURpCz3sDdYRuK4KNkj1+4ZvJo/KzGTMUUIE4B
 IDQG6wDP5Kb388eRDtKrG5X7IXg+L5F/kezin60j0QF5MwDgxirT217teN8H1lNn
 YgWMjRK8Tw0flUJsbCxa51/nl93UtByB+fIRIc88MSeLxcI6/ORW+TxBBEqkYm5Z
 6BLFtmHSuAqAXUuyZXSGLcW7XLJvIaDoHgvbDn6l4g7FMWHqPOIq6nJQY3L8ben2
 e+fQrGh4noI=
 =20Vc
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:

 - Platform PMU driver updates:

     - x86 Intel uncore driver updates for Skylake (SNR) and Icelake (ICX) servers
     - Fix RDPMC support
     - Fix [extended-]PEBS-via-PT support
     - Fix Sapphire Rapids event constraints
     - Fix :ppp support on Sapphire Rapids
     - Fix fixed counter sanity check on Alder Lake & X86_FEATURE_HYBRID_CPU
     - Other heterogenous-PMU fixes

 - Kprobes:

     - Remove the unused and misguided kprobe::fault_handler callbacks.
     - Warn about kprobes taking a page fault.
     - Fix the 'nmissed' stat counter.

 - Misc cleanups and fixes.

* tag 'perf-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix task context PMU for Hetero
  perf/x86/intel: Fix instructions:ppp support in Sapphire Rapids
  perf/x86/intel: Add more events requires FRONTEND MSR on Sapphire Rapids
  perf/x86/intel: Fix fixed counter check warning for some Alder Lake
  perf/x86/intel: Fix PEBS-via-PT reload base value for Extended PEBS
  perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task
  kprobes: Do not increment probe miss count in the fault handler
  x86,kprobes: WARN if kprobes tries to handle a fault
  kprobes: Remove kprobe::fault_handler
  uprobes: Update uprobe_write_opcode() kernel-doc comment
  perf/hw_breakpoint: Fix DocBook warnings in perf hw_breakpoint
  perf/core: Fix DocBook warnings
  perf/core: Make local function perf_pmu_snapshot_aux() static
  perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX
  perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR
  perf/x86/intel/uncore: Generalize I/O stacks to PMON mapping procedure
  perf/x86/intel/uncore: Drop unnecessary NULL checks after container_of()
2021-06-28 12:03:20 -07:00
Linus Torvalds
a15286c63d Locking changes for this cycle:
- Core locking & atomics:
 
      - Convert all architectures to ARCH_ATOMIC: move every
        architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC
        and all the transitory facilities and #ifdefs.
 
        Much reduction in complexity from that series:
 
            63 files changed, 756 insertions(+), 4094 deletions(-)
 
      - Self-test enhancements
 
  - Futexes:
 
      - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that
        doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC).
 
        [ The temptation to repurpose FUTEX_LOCK_PI's implicit
          setting of FLAGS_CLOCKRT & invert the flag's meaning
          to avoid having to introduce a new variant was
          resisted successfully. ]
 
      - Enhance futex self-tests
 
  - Lockdep:
 
      - Fix dependency path printouts
      - Optimize trace saving
      - Broaden & fix wait-context checks
 
  - Misc cleanups and fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZaEYRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hPdxAAiNCsxL6X1cZ8zqbWsvLefT9Zqhzgs5u6
 gdZele7PNibvbYdON26b5RUzuKfOW/hgyX6LKqr+AiNYTT9PGhcY+tycUr2PGk5R
 LMyhJWmmX5cUVPU92ky+z5hEHB2gr4XPJcvgpKKUL0XB1tBaSvy2DtgwPuhXOoT1
 1sCQfy63t71snt2RfEnibVW6xovwaA2lsqL81lLHJN4iRFWvqO498/m4+PWkylsm
 ig/+VT1Oz7t4wqu3NhTqNNZv+4K4W2asniyo53Dg2BnRm/NjhJtgg4jRibrb0ssb
 67Xdq6y8+xNBmEAKj+Re8VpMcu4aj346Ctk7d4gst2ah/Rc0TvqfH6mezH7oq7RL
 hmOrMBWtwQfKhEE/fDkng30nrVxc/98YXP0n2rCCa0ySsaF6b6T185mTcYDRDxFs
 BVNS58ub+zxrF9Zd4nhIHKaEHiL2ZdDimqAicXN0RpywjIzTQ/y11uU7I1WBsKkq
 WkPYs+FPHnX7aBv1MsuxHhb8sUXjG924K4JeqnjF45jC3sC1crX+N0jv4wHw+89V
 h4k20s2Tw6m5XGXlgGwMJh0PCcD6X22Vd9Uyw8zb+IJfvNTGR9Rp1Ec+1gMRSll+
 xsn6G6Uy9bcNU0SqKlBSfelweGKn4ZxbEPn76Jc8KWLiepuZ6vv5PBoOuaujWht9
 KAeOC5XdjMk=
 =tH//
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Core locking & atomics:

     - Convert all architectures to ARCH_ATOMIC: move every architecture
       to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the
       transitory facilities and #ifdefs.

       Much reduction in complexity from that series:

           63 files changed, 756 insertions(+), 4094 deletions(-)

     - Self-test enhancements

 - Futexes:

     - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't
       set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC).

       [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of
         FLAGS_CLOCKRT & invert the flag's meaning to avoid having to
         introduce a new variant was resisted successfully. ]

     - Enhance futex self-tests

 - Lockdep:

     - Fix dependency path printouts

     - Optimize trace saving

     - Broaden & fix wait-context checks

 - Misc cleanups and fixes.

* tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  locking/lockdep: Correct the description error for check_redundant()
  futex: Provide FUTEX_LOCK_PI2 to support clock selection
  futex: Prepare futex_lock_pi() for runtime clock selection
  lockdep/selftest: Remove wait-type RCU_CALLBACK tests
  lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING
  lockdep: Fix wait-type for empty stack
  locking/selftests: Add a selftest for check_irq_usage()
  lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage()
  locking/lockdep: Remove the unnecessary trace saving
  locking/lockdep: Fix the dep path printing for backwards BFS
  selftests: futex: Add futex compare requeue test
  selftests: futex: Add futex wait test
  seqlock: Remove trailing semicolon in macros
  locking/lockdep: Reduce LOCKDEP dependency list
  locking/lockdep,doc: Improve readability of the block matrix
  locking/atomics: atomic-instrumented: simplify ifdeffery
  locking/atomic: delete !ARCH_ATOMIC remnants
  locking/atomic: xtensa: move to ARCH_ATOMIC
  locking/atomic: sparc: move to ARCH_ATOMIC
  locking/atomic: sh: move to ARCH_ATOMIC
  ...
2021-06-28 11:45:29 -07:00
Paolo Bonzini
b8917b4ae4 KVM/arm64 updates for v5.14.
- Add MTE support in guests, complete with tag save/restore interface
 - Reduce the impact of CMOs by moving them in the page-table code
 - Allow device block mappings at stage-2
 - Reduce the footprint of the vmemmap in protected mode
 - Support the vGIC on dumb systems such as the Apple M1
 - Add selftest infrastructure to support multiple configuration
   and apply that to PMU/non-PMU setups
 - Add selftests for the debug architecture
 - The usual crop of PMU fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmDV2bEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDEr8P/ivwROx5NwGcHGmU5RfUCT3aFqhtVHHwD/lu
 jPcgoO61kz9TelOu6QRaVuK+mVHxcq3iP4R8nPq/QCkUlEXTmK2xkyhXhGXSYpH4
 6jM8+BbC3eG7iAxx6H0UM4JTl4Riwat6ZZtXpWEWs9TKqOHOQYFpMkxSttwVZ1CZ
 SjbtFvXLEdzKn6PzUWnKdBNMV/mHsdAtohZit9oJOc4ttc8072XxETQ4TFQ+MSvA
 j9zY9QPmWzgcZnotqRRu9sbTGO2vxtXuUtY3sjdD8+C9OgSe9qvpnNjymcmfwaMu
 1fBkfh65oaO4ItJBdGOUOoEcFqwN5imPiI7CB/O+ZYkO9sBCuTUPSQwPkyiwXb9r
 bUkTaQw2nZiNWsqR1x07fQ2sGYbMp5mnmgmqiV4MUWkLmFp9LZATCWYTTn24cBNS
 6SjVP6/8S0r3EhLnYjH0Pn1we5PooU1EF6RlCAd3ewYoo+9fPnwjNYwIWH5i5wB7
 +tnei44NACAw9cfbos+BYQQ/dY15OSFzLzIMomlabB7OpXOdDg3H6tJnPbFwWwXb
 9nF8XdHqxeDVVVrDCAx1BSodSXm9xqgnQM2RDGTUnpVcAfqAr3MXX6VsyKQDzj8T
 QXF9qOVCBAABv6BXAvSQ6mvMJZDUVbUPEPhf7kXzF46JsRd6A7wWoU/OnMGHQ/w7
 wjvH8HVy
 =fWBV
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for v5.14.

- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration
  and apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
2021-06-25 11:24:24 -04:00
Jing Zhang
bc9e9e672d KVM: debugfs: Reuse binary stats descriptors
To remove code duplication, use the binary stats descriptors in the
implementation of the debugfs interface for statistics. This unifies
the definition of statistics for the binary and debugfs interfaces.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-8-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:29 -04:00
Jing Zhang
ce55c04945 KVM: stats: Support binary stats retrieval for a VCPU
Add a VCPU ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VCPU stats header,
descriptors and data.
Define VCPU statistics descriptors and header for all architectures.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-5-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:19 -04:00
Jing Zhang
fcfe1baedd KVM: stats: Support binary stats retrieval for a VM
Add a VM ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VM stats header,
descriptors and data.
Define VM statistics descriptors and header for all architectures.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-4-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:10 -04:00
Martynas Pumputis
e8b9eab992 net: retrieve netns cookie via getsocketopt
It's getting more common to run nested container environments for
testing cloud software. One of such examples is Kind [1] which runs a
Kubernetes cluster in Docker containers on a single host. Each container
acts as a Kubernetes node, and thus can run any Pod (aka container)
inside the former. This approach simplifies testing a lot, as it
eliminates complicated VM setups.

Unfortunately, such a setup breaks some functionality when cgroupv2 BPF
programs are used for load-balancing. The load-balancer BPF program
needs to detect whether a request originates from the host netns or a
container netns in order to allow some access, e.g. to a service via a
loopback IP address. Typically, the programs detect this by comparing
netns cookies with the one of the init ns via a call to
bpf_get_netns_cookie(NULL). However, in nested environments the latter
cannot be used given the Kubernetes node's netns is outside the init ns.
To fix this, we need to pass the Kubernetes node netns cookie to the
program in a different way: by extending getsockopt() with a
SO_NETNS_COOKIE option, the orchestrator which runs in the Kubernetes
node netns can retrieve the cookie and pass it to the program instead.

Thus, this is following up on Eric's commit 3d368ab87c ("net:
initialize net->net_cookie at netns setup") to allow retrieval via
SO_NETNS_COOKIE.  This is also in line in how we retrieve socket cookie
via SO_COOKIE.

  [1] https://kind.sigs.k8s.io/

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24 11:13:05 -07:00
Jing Zhang
cb082bfab5 KVM: stats: Add fd-based API to read binary stats data
This commit defines the API for userspace and prepare the common
functionalities to support per VM/VCPU binary stats data readings.

The KVM stats now is only accessible by debugfs, which has some
shortcomings this change series are supposed to fix:
1. The current debugfs stats solution in KVM could be disabled
   when kernel Lockdown mode is enabled, which is a potential
   rick for production.
2. The current debugfs stats solution in KVM is organized as "one
   stats per file", it is good for debugging, but not efficient
   for production.
3. The stats read/clear in current debugfs solution in KVM are
   protected by the global kvm_lock.

Besides that, there are some other benefits with this change:
1. All KVM VM/VCPU stats can be read out in a bulk by one copy
   to userspace.
2. A schema is used to describe KVM statistics. From userspace's
   perspective, the KVM statistics are self-describing.
3. With the fd-based solution, a separate telemetry would be able
   to read KVM stats in a less privileged environment.
4. After the initial setup by reading in stats descriptors, a
   telemetry only needs to read the stats data itself, no more
   parsing or setup is needed.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-3-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 11:47:57 -04:00
Jing Zhang
0193cc908b KVM: stats: Separate generic stats from architecture specific ones
Generic KVM stats are those collected in architecture independent code
or those supported by all architectures; put all generic statistics in
a separate structure.  This ensures that they are defined the same way
in the statistics API which is being added, removing duplication among
different architectures in the declaration of the descriptors.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 11:47:56 -04:00
zhouchuangao
a2cdc24e20 mips/kvm: Use BUG_ON instead of if condition followed by BUG
BUG_ON uses unlikely in if(), it can be optimized at compile time.

Usually, the condition in if() is not satisfied. In my opinion,
this can improve the efficiency of the multi-stage pipeline.

Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-06-21 11:40:54 +02:00