Commit graph

464 commits

Author SHA1 Message Date
Mike Kravetz
aafd4562df mm: arch: consolidate mmap hugetlb size encodings
A non-default huge page size can be encoded in the flags argument of the
mmap system call.  The definitions for these encodings are in arch
specific header files.  However, all architectures use the same values.

Consolidate all the definitions in the primary user header file
(uapi/linux/mman.h).  Include definitions for all known huge page sizes.
Use the generic encoding definitions in hugetlb_encode.h as the basis
for these definitions.

Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Linus Torvalds
6caffe21dd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha updates from Matt Turner:
 "This contains some small clean up patches I've neglected, and some
  build improvements from Ben Hutchings"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha: math-emu: Fix modular build
  alpha: Restore symbol versions for symbols exported from assembly
  alpha: defconfig: Cleanup from old Kconfig options
  alpha: use kobj_to_dev()
  alpha: squash lines for immediate return
  alpha: kernel: Use vma_pages()
  alpha: silence a buffer overflow warning
  alpha: marvel: make use of raw_spinlock variants
  alpha: cleanup: remove __NR_sys_epoll_*, leave __NR_epoll_*
  alpha: use generic fb.h
2017-09-05 09:41:38 -07:00
Ben Hutchings
873f9b5bcb alpha: Restore symbol versions for symbols exported from assembly
Add <asm/asm-prototypes.h> so that genksyms knows the types of
these symbols and can generate CRCs for them.

Fixes: 00fc0e0dda ("alpha: move exports to actual definitions")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-09-04 12:04:34 -07:00
Julia Cartwright
b5a3a128b4 alpha: marvel: make use of raw_spinlock variants
The alpha/marvel code currently implements an irq_chip for handling
interrupts; due to how irq_chip handling is done, it's necessary for the
irq_chip methods to be invoked from hardirq context, even on a a
real-time kernel.  Because the spinlock_t type becomes a "sleeping"
spinlock w/ RT kernels, it is not suitable to be used with irq_chips.

A quick audit of the operations under the lock reveal that they do only
minimal, bounded work, and are therefore safe to do under a raw spinlock.

Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-09-04 12:04:31 -07:00
Sergei Trofimovich
beb1057feb alpha: cleanup: remove __NR_sys_epoll_*, leave __NR_epoll_*
__NR_sys_epoll_create and friends are alpha-specific
while __NR_epoll_create is a generic name for other
arches.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-09-04 12:04:31 -07:00
Tobias Klauser
1c0234aac8 alpha: use generic fb.h
The arch uses a verbatim copy of the asm-generic version and does not
add any own implemntations to the header, so use asm-generic/fb.h
instead of duplicating code.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-09-04 12:04:30 -07:00
Linus Torvalds
5f82e71a00 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:

 - Add 'cross-release' support to lockdep, which allows APIs like
   completions, where it's not the 'owner' who releases the lock, to be
   tracked. It's all activated automatically under
   CONFIG_PROVE_LOCKING=y.

 - Clean up (restructure) the x86 atomics op implementation to be more
   readable, in preparation of KASAN annotations. (Dmitry Vyukov)

 - Fix static keys (Paolo Bonzini)

 - Add killable versions of down_read() et al (Kirill Tkhai)

 - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

 - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

 - Remove smp_mb__before_spinlock() and convert its usages, introduce
   smp_mb__after_spinlock() (Peter Zijlstra)

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  locking/lockdep/selftests: Fix mixed read-write ABBA tests
  sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
  acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
  locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
  smp: Avoid using two cache lines for struct call_single_data
  locking/lockdep: Untangle xhlock history save/restore from task independence
  locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
  futex: Remove duplicated code and fix undefined behaviour
  Documentation/locking/atomic: Finish the document...
  locking/lockdep: Fix workqueue crossrelease annotation
  workqueue/lockdep: 'Fix' flush_work() annotation
  locking/lockdep/selftests: Add mixed read-write ABBA tests
  mm, locking/barriers: Clarify tlb_flush_pending() barriers
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
  locking/lockdep: Explicitly initialize wq_barrier::done::map
  locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
  locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
  locking/refcounts, x86/asm: Implement fast refcount overflow protection
  locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
  ...
2017-09-04 11:52:29 -07:00
Linus Torvalds
0081a0ce80 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnad:
 "The main RCU related changes in this cycle were:

   - Removal of spin_unlock_wait()
   - SRCU updates
   - RCU torture-test updates
   - RCU Documentation updates
   - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant
   - Miscellaneous RCU fixes
   - CPU-hotplug fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  arch: Remove spin_unlock_wait() arch-specific definitions
  locking: Remove spin_unlock_wait() generic definitions
  drivers/ata: Replace spin_unlock_wait() with lock/unlock pair
  ipc: Replace spin_unlock_wait() with lock/unlock pair
  exit: Replace spin_unlock_wait() with lock/unlock pair
  completion: Replace spin_unlock_wait() with lock/unlock pair
  doc: Set down RCU's scheduling-clock-interrupt needs
  doc: No longer allowed to use rcu_dereference on non-pointers
  doc: Add RCU files to docbook-generation files
  doc: Update memory-barriers.txt for read-to-write dependencies
  doc: Update RCU documentation
  membarrier: Provide expedited private command
  rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter()
  rcu: Add warning to rcu_idle_enter() for irqs enabled
  rcu: Make rcu_idle_enter() rely on callers disabling irqs
  rcu: Add assertions verifying blocked-tasks list
  rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit()
  rcu: Add TPS() protection for _rcu_barrier_trace strings
  rcu: Use idle versions of swait to make idle-hack clear
  swait: Add idle variants which don't contribute to load average
  ...
2017-09-04 08:13:52 -07:00
Ingo Molnar
edc2988c54 Merge branch 'linus' into locking/core, to fix up conflicts
Conflicts:
	mm/page_alloc.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-04 11:01:18 +02:00
Ben Hutchings
cec80d8214 alpha: uapi: Add support for __SANE_USERSPACE_TYPES__
This fixes compiler errors in perf such as:

tests/attr.c: In function 'store_event':
tests/attr.c:66:27: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64 {aka long unsigned int}' [-Werror=format=]
  snprintf(path, PATH_MAX, "%s/event-%d-%llu-%d", dir,
                           ^

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-08-29 12:02:00 -07:00
Guenter Roeck
7817cedce0 alpha: Define ioremap_wc
Commit 3cc2dac5be ("drivers/video/fbdev/atyfb: Replace MTRR UC hole
with strong UC") introduces calls to ioremap_wc and ioremap_uc. This
causes build failures with alpha:allmodconfig. Map the missing functions
to ioremap_nocache.

Fixes: 3cc2dac5be ("drivers/video/fbdev/atyfb:
        Replace MTRR UC hole with strong UC")
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-08-29 12:01:57 -07:00
Richard Henderson
a720830613 alpha: Update for new syscalls
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2017-08-29 12:01:43 -07:00
Jiri Slaby
30d6e0a419 futex: Remove duplicated code and fix undefined behaviour
There is code duplicated over all architecture's headers for
futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
and comparison of the result.

Remove this duplication and leave up to the arches only the needed
assembly which is now in arch_futex_atomic_op_inuser.

This effectively distributes the Will Deacon's arm64 fix for undefined
behaviour reported by UBSAN to all architectures. The fix was done in
commit 5f16a046f8 (arm64: futex: Fix undefined behaviour with
FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.

And as suggested by Thomas, check for negative oparg too, because it was
also reported to cause undefined behaviour report.

Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
remove pointless access_ok() checks") as access_ok there returns true.
We introduce it back to the helper for the sake of simplicity (it gets
optimized away anyway).

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
Reviewed-by: Darren Hart (VMware) <dvhart@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
Cc: linux-mips@linux-mips.org
Cc: Rich Felker <dalias@libc.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: peterz@infradead.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: sparclinux@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: linux-s390@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: linux-hexagon@vger.kernel.org
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-xtensa@linux-xtensa.org
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Stafford Horne <shorne@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Richard Henderson <rth@twiddle.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-parisc@vger.kernel.org
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-alpha@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
2017-08-25 22:49:59 +02:00
Paul E. McKenney
952111d7db arch: Remove spin_unlock_wait() arch-specific definitions
There is no agreed-upon definition of spin_unlock_wait()'s semantics,
and it appears that all callers could do just as well with a lock/unlock
pair.  This commit therefore removes the underlying arch-specific
arch_spin_unlock_wait() for all architectures providing them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
2017-08-17 08:08:59 -07:00
Gleb Fotengauer-Malinovskiy
c632517923 tty: Fix TIOCGPTPEER ioctl definition
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
because it doesn't copy anything from/to userspace to access the
argument.

Fixes: 54ebbfb160 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Aleksa Sarai <asarai@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-17 17:04:41 +02:00
Linus Torvalds
dc502142b6 Merge branch 'uaccess.strlen' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull user access str* updates from Al Viro:
 "uaccess str...() dead code removal"

* 'uaccess.strlen' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  s390 keyboard.c: don't open-code strndup_user()
  mips: get rid of unused __strnlen_user()
  get rid of unused __strncpy_from_user() instances
  kill strlen_user()
2017-07-06 22:07:44 -07:00
Linus Torvalds
c856863988 Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc compat stuff updates from Al Viro:
 "This part is basically untangling various compat stuff. Compat
  syscalls moved to their native counterparts, getting rid of quite a
  bit of double-copying and/or set_fs() uses. A lot of field-by-field
  copyin/copyout killed off.

   - kernel/compat.c is much closer to containing just the
     copyin/copyout of compat structs. Not all compat syscalls are gone
     from it yet, but it's getting there.

   - ipc/compat_mq.c killed off completely.

   - block/compat_ioctl.c cleaned up; floppy compat ioctls moved to
     drivers/block/floppy.c where they belong. Yes, there are several
     drivers that implement some of the same ioctls. Some are m68k and
     one is 32bit-only pmac. drivers/block/floppy.c is the only one in
     that bunch that can be built on biarch"

* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  mqueue: move compat syscalls to native ones
  usbdevfs: get rid of field-by-field copyin
  compat_hdio_ioctl: get rid of set_fs()
  take floppy compat ioctls to sodding floppy.c
  ipmi: get rid of field-by-field __get_user()
  ipmi: get COMPAT_IPMICTL_RECEIVE_MSG in sync with the native one
  rt_sigtimedwait(): move compat to native
  select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()
  put_compat_rusage(): switch to copy_to_user()
  sigpending(): move compat to native
  getrlimit()/setrlimit(): move compat to native
  times(2): move compat to native
  compat_{get,put}_bitmap(): use unsafe_{get,put}_user()
  fb_get_fscreeninfo(): don't bother with do_fb_ioctl()
  do_sigaltstack(): lift copying to/from userland into callers
  take compat_sys_old_getrlimit() to native syscall
  trim __ARCH_WANT_SYS_OLD_GETRLIMIT
2017-07-06 20:57:13 -07:00
Linus Torvalds
5518b69b76 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Reasonably busy this cycle, but perhaps not as busy as in the 4.12
  merge window:

   1) Several optimizations for UDP processing under high load from
      Paolo Abeni.

   2) Support pacing internally in TCP when using the sch_fq packet
      scheduler for this is not practical. From Eric Dumazet.

   3) Support mutliple filter chains per qdisc, from Jiri Pirko.

   4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

   5) Add batch dequeueing to vhost_net, from Jason Wang.

   6) Flesh out more completely SCTP checksum offload support, from
      Davide Caratti.

   7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
      Neira Ayuso, and Matthias Schiffer.

   8) Add devlink support to nfp driver, from Simon Horman.

   9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
      Prabhu.

  10) Add stack depth tracking to BPF verifier and use this information
      in the various eBPF JITs. From Alexei Starovoitov.

  11) Support XDP on qed device VFs, from Yuval Mintz.

  12) Introduce BPF PROG ID for better introspection of installed BPF
      programs. From Martin KaFai Lau.

  13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

  14) For loads, allow narrower accesses in bpf verifier checking, from
      Yonghong Song.

  15) Support MIPS in the BPF selftests and samples infrastructure, the
      MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
      Daney.

  16) Support kernel based TLS, from Dave Watson and others.

  17) Remove completely DST garbage collection, from Wei Wang.

  18) Allow installing TCP MD5 rules using prefixes, from Ivan
      Delalande.

  19) Add XDP support to Intel i40e driver, from Björn Töpel

  20) Add support for TC flower offload in nfp driver, from Simon
      Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
      Kicinski, and Bert van Leeuwen.

  21) IPSEC offloading support in mlx5, from Ilan Tayari.

  22) Add HW PTP support to macb driver, from Rafal Ozieblo.

  23) Networking refcount_t conversions, From Elena Reshetova.

  24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
      for tuning the TCP sockopt settings of a group of applications,
      currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
  net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
  dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
  cxgb4: Support for get_ts_info ethtool method
  cxgb4: Add PTP Hardware Clock (PHC) support
  cxgb4: time stamping interface for PTP
  nfp: default to chained metadata prepend format
  nfp: remove legacy MAC address lookup
  nfp: improve order of interfaces in breakout mode
  net: macb: remove extraneous return when MACB_EXT_DESC is defined
  bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
  bpf: fix return in load_bpf_file
  mpls: fix rtm policy in mpls_getroute
  net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
  net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
  net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
  net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
  ...
2017-07-05 12:31:59 -07:00
David Herrmann
28b5ba2aa0 net: introduce SO_PEERGROUPS getsockopt
This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
retrieve the auxiliary groups of the remote peer. It is designed to
naturally extend SO_PEERCRED. That is, the underlying data is from the
same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
is, if the provided buffer is too small, ERANGE is returned and @optlen
is updated. Otherwise, the information is copied, @optlen is set to the
actual size, and 0 is returned.

While SO_PEERCRED (and thus `struct ucred') already returns the primary
group, it lacks the auxiliary group vector. However, nearly all access
controls (including kernel side VFS and SYSVIPC, but also user-space
polkit, DBus, ...) consider the entire set of groups, rather than just
the primary group. But this is currently not possible with pure
SO_PEERCRED. Instead, user-space has to work around this and query the
system database for the auxiliary groups of a UID retrieved via
SO_PEERCRED.

Unfortunately, there is no race-free way to query the auxiliary groups
of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
solution is to use getgrouplist(3p), which itself falls back to NSS and
whatever is configured in nsswitch.conf(3). This effectively checks
which groups we *would* assign to the user if it logged in *now*. On
normal systems it is as easy as reading /etc/group, but with NSS it can
resort to quering network databases (eg., LDAP), using IPC or network
communication.

Long story short: Whenever we want to use auxiliary groups for access
checks on IPC, we need further IPC to talk to the user/group databases,
rather than just relying on SO_PEERCRED and the incoming socket. This
is unfortunate, and might even result in dead-locks if the database
query uses the same IPC as the original request.

So far, those recursions / dead-locks have been avoided by using
primitive IPC for all crucial NSS modules. However, we want to avoid
re-inventing the wheel for each NSS module that might be involved in
user/group queries. Hence, we would preferably make DBus (and other IPC
that supports access-management based on groups) work without resorting
to the user/group database. This new SO_PEERGROUPS ioctl would allow us
to make dbus-daemon work without ever calling into NSS.

Cc: Michal Sekletar <msekleta@redhat.com>
Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
Reviewed-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21 11:38:41 -04:00
Aleksa Sarai
54ebbfb160 tty: add TIOCGPTPEER ioctl
When opening the slave end of a PTY, it is not possible for userspace to
safely ensure that /dev/pts/$num is actually a slave (in cases where the
mount namespace in which devpts was mounted is controlled by an
untrusted process). In addition, there are several unresolvable
race conditions if userspace were to attempt to detect attacks through
stat(2) and other similar methods [in addition it is not clear how
userspace could detect attacks involving FUSE].

Resolve this by providing an interface for userpace to safely open the
"peer" end of a PTY file descriptor by using the dentry cached by
devpts. Since it is not possible to have an open master PTY without
having its slave exposed in /dev/pts this interface is safe. This
interface currently does not provide a way to get the master pty (since
it is not clear whether such an interface is safe or even useful).

Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Valentin Rothberg <vrothberg@suse.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-09 12:27:54 +02:00
Al Viro
123dbfe088 trim __ARCH_WANT_SYS_OLD_GETRLIMIT
alpha. m32r, mips and parisc define it, even though the syscall itself
is not wired.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-27 15:38:02 -04:00
David S. Miller
1c4f676a68 net: Define SCM_TIMESTAMPING_PKTINFO on all architectures.
A definition was only provided for asm-generic/socket.h
using platforms, define it for the others as well

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21 23:13:37 -04:00
Al Viro
8298525839 kill strlen_user()
no callers, no consistent semantics, no sane way to use it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-05-15 23:40:22 -04:00
Nicolas Dichtel
fcc8487d47 uapi: export all headers under uapi directories
Regularly, when a new header is created in include/uapi/, the developer
forgets to add it in the corresponding Kbuild file. This error is usually
detected after the release is out.

In fact, all headers under uapi directories should be exported, thus it's
useless to have an exhaustive list.

After this patch, the following files, which were not exported, are now
exported (with make headers_install_all):
asm-arc/kvm_para.h
asm-arc/ucontext.h
asm-blackfin/shmparam.h
asm-blackfin/ucontext.h
asm-c6x/shmparam.h
asm-c6x/ucontext.h
asm-cris/kvm_para.h
asm-h8300/shmparam.h
asm-h8300/ucontext.h
asm-hexagon/shmparam.h
asm-m32r/kvm_para.h
asm-m68k/kvm_para.h
asm-m68k/shmparam.h
asm-metag/kvm_para.h
asm-metag/shmparam.h
asm-metag/ucontext.h
asm-mips/hwcap.h
asm-mips/reg.h
asm-mips/ucontext.h
asm-nios2/kvm_para.h
asm-nios2/ucontext.h
asm-openrisc/shmparam.h
asm-parisc/kvm_para.h
asm-powerpc/perf_regs.h
asm-sh/kvm_para.h
asm-sh/ucontext.h
asm-tile/shmparam.h
asm-unicore32/shmparam.h
asm-unicore32/ucontext.h
asm-x86/hwcap2.h
asm-xtensa/kvm_para.h
drm/armada_drm.h
drm/etnaviv_drm.h
drm/vgem_drm.h
linux/aspeed-lpc-ctrl.h
linux/auto_dev-ioctl.h
linux/bcache.h
linux/btrfs_tree.h
linux/can/vxcan.h
linux/cifs/cifs_mount.h
linux/coresight-stm.h
linux/cryptouser.h
linux/fsmap.h
linux/genwqe/genwqe_card.h
linux/hash_info.h
linux/kcm.h
linux/kcov.h
linux/kfd_ioctl.h
linux/lightnvm.h
linux/module.h
linux/nbd-netlink.h
linux/nilfs2_api.h
linux/nilfs2_ondisk.h
linux/nsfs.h
linux/pr.h
linux/qrtr.h
linux/rpmsg.h
linux/sched/types.h
linux/sed-opal.h
linux/smc.h
linux/smc_diag.h
linux/stm.h
linux/switchtec_ioctl.h
linux/vfio_ccw.h
linux/wil6210_uapi.h
rdma/bnxt_re-abi.h

Note that I have removed from this list the files which are generated in every
exported directories (like .install or .install.cmd).

Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all
subdirs with a pure makefile command.

For the record, note that exported files for asm directories are a mix of
files listed by:
 - include/uapi/asm-generic/Kbuild.asm;
 - arch/<arch>/include/uapi/asm/Kbuild;
 - arch/<arch>/include/asm/Kbuild.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-05-11 00:21:54 +09:00
Linus Torvalds
8d65b08deb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Millar:
 "Here are some highlights from the 2065 networking commits that
  happened this development cycle:

   1) XDP support for IXGBE (John Fastabend) and thunderx (Sunil Kowuri)

   2) Add a generic XDP driver, so that anyone can test XDP even if they
      lack a networking device whose driver has explicit XDP support
      (me).

   3) Sparc64 now has an eBPF JIT too (me)

   4) Add a BPF program testing framework via BPF_PROG_TEST_RUN (Alexei
      Starovoitov)

   5) Make netfitler network namespace teardown less expensive (Florian
      Westphal)

   6) Add symmetric hashing support to nft_hash (Laura Garcia Liebana)

   7) Implement NAPI and GRO in netvsc driver (Stephen Hemminger)

   8) Support TC flower offload statistics in mlxsw (Arkadi Sharshevsky)

   9) Multiqueue support in stmmac driver (Joao Pinto)

  10) Remove TCP timewait recycling, it never really could possibly work
      well in the real world and timestamp randomization really zaps any
      hint of usability this feature had (Soheil Hassas Yeganeh)

  11) Support level3 vs level4 ECMP route hashing in ipv4 (Nikolay
      Aleksandrov)

  12) Add socket busy poll support to epoll (Sridhar Samudrala)

  13) Netlink extended ACK support (Johannes Berg, Pablo Neira Ayuso,
      and several others)

  14) IPSEC hw offload infrastructure (Steffen Klassert)"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2065 commits)
  tipc: refactor function tipc_sk_recv_stream()
  tipc: refactor function tipc_sk_recvmsg()
  net: thunderx: Optimize page recycling for XDP
  net: thunderx: Support for XDP header adjustment
  net: thunderx: Add support for XDP_TX
  net: thunderx: Add support for XDP_DROP
  net: thunderx: Add basic XDP support
  net: thunderx: Cleanup receive buffer allocation
  net: thunderx: Optimize CQE_TX handling
  net: thunderx: Optimize RBDR descriptor handling
  net: thunderx: Support for page recycling
  ipx: call ipxitf_put() in ioctl error path
  net: sched: add helpers to handle extended actions
  qed*: Fix issues in the ptp filter config implementation.
  qede: Fix concurrency issue in PTP Tx path processing.
  stmmac: Add support for SIMATIC IOT2000 platform
  net: hns: fix ethtool_get_strings overflow in hns driver
  tcp: fix wraparound issue in tcp_lp
  bpf, arm64: fix jit branch offset related to ldimm64
  bpf, arm64: implement jiting of BPF_XADD
  ...
2017-05-02 16:40:27 -07:00
Chenbo Feng
5daab9db7b New getsockopt option to get socket cookie
Introduce a new getsockopt operation to retrieve the socket cookie
for a specific socket based on the socket fd.  It returns a unique
non-decreasing cookie for each socket.
Tested: https://android-review.googlesource.com/#/c/358163/

Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-08 08:07:01 -07:00
Al Viro
ec022681a4 alpha: switch to RAW_COPY_USER
copy_{to,from}_user() uninlined.  We can go back to inlined variants,
if we want to...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:19 -04:00
Al Viro
ca282f6973 alpha: add a helper for emitting exception table entries
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:19 -04:00
Al Viro
c9df6025a6 alpha: kill the 'segment' argument of __access_ok()
always equal to get_fs().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:18 -04:00
Al Viro
d78d834bed alpha: get rid of 'segment' argument of __{get,put}_user_check()
always equal to get_fs()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:17 -04:00
Al Viro
b5478c1b67 alpha: add asm/extable.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:17 -04:00
Al Viro
8525023121 alpha: switch __copy_user() and __do_clean_user() to normal calling conventions
They used to need odd calling conventions due to old exception handling
mechanism, the last remnants of which had disappeared back in 2002.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 18:23:16 -04:00
Al Viro
db68ce10c4 new helper: uaccess_kernel()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-28 16:43:25 -04:00
Sridhar Samudrala
6d4339028b net: Introduce SO_INCOMING_NAPI_ID
This socket option returns the NAPI ID associated with the queue on which
the last frame is received. This information can be used by the apps to
split the incoming flows among the threads based on the Rx queue on which
they are received.

If the NAPI ID actually represents a sender_cpu then the value is ignored
and 0 is returned.

Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-24 20:49:31 -07:00
Josh Hunt
a2d133b1d4 sock: introduce SO_MEMINFO getsockopt
Allows reading of SK_MEMINFO_VARS via socket option. This way an
application can get all meminfo related information in single socket
option call instead of multiple calls.

Adds helper function, sk_get_meminfo(), and uses that for both
getsockopt and sock_diag_put_meminfo().

Suggested by Eric Dumazet.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22 11:18:58 -07:00
Al Viro
af1d5b37d6 uaccess: drop duplicate includes from asm/uaccess.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-05 21:57:49 -05:00
Al Viro
5e6039d8a3 uaccess: move VERIFY_{READ,WRITE} definitions to linux/uaccess.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-05 20:40:25 -05:00
Ingo Molnar
589ee62844 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h>
Update code that relied on sched.h including various MM types for them.

This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:37 +01:00
Luis R. Rodriguez
7d134b2ce6 kprobes: move kprobe declarations to asm-generic/kprobes.h
Often all is needed is these small helpers, instead of compiler.h or a
full kprobes.h.  This is important for asm helpers, in fact even some
asm/kprobes.h make use of these helpers...  instead just keep a generic
asm file with helpers useful for asm code with the least amount of
clutter as possible.

Likewise we need now to also address what to do about this file for both
when architectures have CONFIG_HAVE_KPROBES, and when they do not.  Then
for when architectures have CONFIG_HAVE_KPROBES but have disabled
CONFIG_KPROBES.

Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES,
this means most architecture code cannot include asm/kprobes.h safely.
Correct this and add guards for architectures missing them.
Additionally provide architectures that not have kprobes support with
the default asm-generic solution.  This lets us force asm/kprobes.h on
the header include/linux/kprobes.h always, but most importantly we can
now safely include just asm/kprobes.h on architecture code without
bringing the full kitchen sink of header files.

Two architectures already provided a guard against CONFIG_KPROBES on its
kprobes.h: sh, arch.  The rest of the architectures needed gaurds added.
We avoid including any not-needed headers on asm/kprobes.h unless
kprobes have been enabled.

In a subsequent atomic change we can try now to remove compiler.h from
include/linux/kprobes.h.

During this sweep I've also identified a few architectures defining a
common macro needed for both kprobes and ftrace, that of the definition
of the breakput instruction up.  Some refer to this as
BREAKPOINT_INSTRUCTION.  This must be kept outside of the #ifdef
CONFIG_KPROBES guard.

[mcgrof@kernel.org: fix arm64 build]
  Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com
[sfr@canb.auug.org.au: fixup for kprobes declarations moving]
  Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Linus Torvalds
ac1820fb28 This is a tree wide change and has been kept separate for that reason.
Bart Van Assche noted that the ib DMA mapping code was significantly
 similar enough to the core DMA mapping code that with a few changes
 it was possible to remove the IB DMA mapping code entirely and
 switch the RDMA stack to use the core DMA mapping code.  This resulted
 in a nice set of cleanups, but touched the entire tree.  This branch
 will be submitted separately to Linus at the end of the merge window
 as per normal practice for tree wide changes like this.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
 2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
 CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
 Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
 xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
 0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
 1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
 ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
 IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
 ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
 XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
 3e7mgpcwC+3XfA/6vU3F
 =e0Si
 -----END PGP SIGNATURE-----

Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma DMA mapping updates from Doug Ledford:
 "Drop IB DMA mapping code and use core DMA code instead.

  Bart Van Assche noted that the ib DMA mapping code was significantly
  similar enough to the core DMA mapping code that with a few changes it
  was possible to remove the IB DMA mapping code entirely and switch the
  RDMA stack to use the core DMA mapping code.

  This resulted in a nice set of cleanups, but touched the entire tree
  and has been kept separate for that reason."

* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
  IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
  IB/core: Remove ib_device.dma_device
  nvme-rdma: Switch from dma_device to dev.parent
  RDS: net: Switch from dma_device to dev.parent
  IB/srpt: Modify a debug statement
  IB/srp: Switch from dma_device to dev.parent
  IB/iser: Switch from dma_device to dev.parent
  IB/IPoIB: Switch from dma_device to dev.parent
  IB/rxe: Switch from dma_device to dev.parent
  IB/vmw_pvrdma: Switch from dma_device to dev.parent
  IB/usnic: Switch from dma_device to dev.parent
  IB/qib: Switch from dma_device to dev.parent
  IB/qedr: Switch from dma_device to dev.parent
  IB/ocrdma: Switch from dma_device to dev.parent
  IB/nes: Remove a superfluous assignment statement
  IB/mthca: Switch from dma_device to dev.parent
  IB/mlx5: Switch from dma_device to dev.parent
  IB/mlx4: Switch from dma_device to dev.parent
  IB/i40iw: Remove a superfluous assignment statement
  IB/hns: Switch from dma_device to dev.parent
  ...
2017-02-25 13:45:43 -08:00
Davidlohr Bueso
35ca6953ca alpha: use generic current.h
Given that the arch does not add its own implementations, simply use the
asm-generic/current.h (generic-y) header instead of duplicating code.

Link: http://lkml.kernel.org/r/1485992878-4780-2-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:56 -08:00
Frederic Weisbecker
b672592f02 sched/cputime: Remove generic asm headers
cputime_t is now only used by two architectures:

	* powerpc (when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y)
	* s390

And since the core doesn't use it anymore, we don't need any arch support
from the others. So we can remove their stub implementations.

A final cleanup would be to provide an efficient pure arch
implementation of cputime_to_nsec() for s390 and powerpc and finally
remove include/linux/cputime.h .

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-36-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01 09:14:07 +01:00
Bart Van Assche
815dd18788 treewide: Consolidate get_dma_ops() implementations
Introduce a new architecture-specific get_arch_dma_ops() function
that takes a struct bus_type * argument. Add get_dma_ops() in
<linux/dma-mapping.h>.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Bart Van Assche
5299709d0a treewide: Constify most dma_map_ops structures
Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:

git grep -l 'struct dma_map_ops' |
  xargs -d\\n sed -i \
    -e 's/struct dma_map_ops/const struct dma_map_ops/g' \
    -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
    -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
    -e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
       -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
       -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
    drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Linus Torvalds
6cdf89b1ca Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The tree got pretty big in this development cycle, but the net effect
  is pretty good:

    115 files changed, 673 insertions(+), 1522 deletions(-)

  The main changes were:

   - Rework and generalize the mutex code to remove per arch mutex
     primitives. (Peter Zijlstra)

   - Add vCPU preemption support: add an interface to query the
     preemption status of vCPUs and use it in locking primitives - this
     optimizes paravirt performance. (Pan Xinhui, Juergen Gross,
     Christian Borntraeger)

   - Introduce cpu_relax_yield() and remov cpu_relax_lowlatency() to
     clean up and improve the s390 lock yielding machinery and its core
     kernel impact. (Christian Borntraeger)

   - Micro-optimize mutexes some more. (Waiman Long)

   - Reluctantly add the to-be-deprecated mutex_trylock_recursive()
     interface on a temporary basis, to give the DRM code more time to
     get rid of its locking hacks. Any other users will be NAK-ed on
     sight. (We turned off the deprecation warning for the time being to
     not pollute the build log.) (Peter Zijlstra)

   - Improve the rtmutex code a bit, in light of recent long lived
     bugs/races. (Thomas Gleixner)

   - Misc fixes, cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/paravirt: Fix bool return type for PVOP_CALL()
  x86/paravirt: Fix native_patch()
  locking/ww_mutex: Use relaxed atomics
  locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked()
  locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL
  x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()
  locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted
  locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()
  Documentation/virtual/kvm: Support the vCPU preemption check
  x86/xen: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  kvm: Introduce kvm_write_guest_offset_cached()
  locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests
  locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
  locking/core, powerpc: Implement vcpu_is_preempted(cpu)
  sched/core: Introduce the vcpu_is_preempted(cpu) interface
  sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q
  locking/core: Provide common cpu_relax_yield() definition
  locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily
  ...
2016-12-12 10:48:02 -08:00
Francis Yan
1c885808e4 tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING
This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation.  For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.

To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.

Signed-off-by: Francis Yan <francisyyan@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 10:04:25 -05:00
Christian Borntraeger
6d0d287891 locking/core: Provide common cpu_relax_yield() definition
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:17:36 +01:00
Christian Borntraeger
5bd0b85ba8 locking/core, arch: Remove cpu_relax_lowlatency()
As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:11 +01:00
Christian Borntraeger
79ab11cdb9 locking/core: Introduce cpu_relax_yield()
For spinning loops people do often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power,sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more, it uses an hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, lets revert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:09 +01:00
Peter Zijlstra
890658b7ab locking/mutex: Kill arch specific code
Its all generic atomic_long_t stuff now.

Tested-by: Jason Low <jason.low2@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 11:31:51 +02:00