Commit graph

40429 commits

Author SHA1 Message Date
Peter Zijlstra
dfd01f0260 sched/wait: Fix the signal handling fix
Jan Stancek reported that I wrecked things for him by fixing things for
Vladimir :/

His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which
should not be possible, however my previous patch made this possible by
unconditionally checking signal_pending().

We cannot use current->state as was done previously, because the
instruction after the store to that variable it can be changed.  We must
instead pass the initial state along and use that.

Fixes: 68985633bc ("sched/wait: Fix signal handling in bit wait helpers")
Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Chris Mason <clm@fb.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Paul Turner <pjt@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: tglx@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: hpa@zytor.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-13 14:30:59 -08:00
Xin Long
a907e36d54 netfilter: nf_tables: use reverse traversal commit_list in nf_tables_abort
When we use 'nft -f' to submit rules, it will build multiple rules into
one netlink skb to send to kernel, kernel will process them one by one.
meanwhile, it add the trans into commit_list to record every commit.
if one of them's return value is -EAGAIN, status |= NFNL_BATCH_REPLAY
will be marked. after all the process is done. it will roll back all the
commits.

now kernel use list_add_tail to add trans to commit, and use
list_for_each_entry_safe to roll back. which means the order of adding
and rollback is the same. that will cause some cases cannot work well,
even trigger call trace, like:

1. add a set into table foo  [return -EAGAIN]:
   commit_list = 'add set trans'
2. del foo:
   commit_list = 'add set trans' -> 'del set trans' -> 'del tab trans'
then nf_tables_abort will be called to roll back:
firstly process 'add set trans':
                   case NFT_MSG_NEWSET:
                        trans->ctx.table->use--;
                        list_del_rcu(&nft_trans_set(trans)->list);

  it will del the set from the table foo, but it has removed when del
  table foo [step 2], then the kernel will panic.

the right order of rollback should be:
  'del tab trans' -> 'del set trans' -> 'add set trans'.
which is opposite with commit_list order.

so fix it by rolling back commits with reverse order in nf_tables_abort.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-13 22:47:32 +01:00
Linus Torvalds
fc89182834 NFS client bugfix for Linux 4.4
Bugfixes:
 - SUNRPC: Fix a NFSv4.1 callback channel regression
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWba3xAAoJEGcL54qWCgDyMLQQAJKU4s513LiYJ9UDil5Q+sfP
 B4flTt/uH1v3MLX31J9Z987jFNsqd9sGaw4E+03xrZNZRY5gToG7iko2im2S6YlW
 E6+yoK45JGZGbJVIMx1pUdzEuBlwtpn+kivrPEte1veJfw5LFwL8NbLjd4Kz1JXi
 h38Wv6OEvrHJJCWtkHjSVSj1ediqgULq11pHYF2kgOctLPcwMlO7XqwX6EDs2G0T
 lrJn6lK0J+0ULOTaf6OH1jdvCj30AfqpvbrT+BTxUnfzLNFWLNn8f0j8b7QRe/lM
 enmAq/1seK2S9v//D5qDcuNcuH41lhyGNfQsduJE8w2XOlYgbDWT0LIPNQr6XWLW
 DkHhuNA4N7TrCRKy07DEQTwR1+oaONX1z4N/cK73K8z+LkF4V5aQVbpYC8NU88+U
 /78Zjtht8gcYwKeEC2fTll1nufVbkbiWINQeMIXYauheOlB+hmyCm6KZ9EdX8AZS
 ItWJcf+n9Mp5Uu5tjeVquifymr5smZzgM9pRXnMljrhr/bqUwecy23lFmgiz4L4B
 pTUggOXgOu2Zs6K699wvaeZVpUv0mt29JDjB4bDIUBaMLDFy9l4L83HKfX3dUtHQ
 DpchaLjrQN57KpwWMmILxjC9u4yPv3+KRRjNZJiBP6+NEfeQO2iNl1ZoH2XRKHOR
 c4ZPFBuKSFdO1zwrdZHc
 =55Qy
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfix from Trond Myklebust:
 "SUNRPC: Fix a NFSv4.1 callback channel regression"

* tag 'nfs-for-4.4-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Fix callback channel
2015-12-13 12:46:04 -08:00
Robert Shearman
f20367df1a mpls: make via address optional for multipath routes
The via address is optional for a single path route, yet is mandatory
when the multipath attribute is used:

  # ip -f mpls route add 100 dev lo
  # ip -f mpls route add 101 nexthop dev lo
  RTNETLINK answers: Invalid argument

Make them consistent by making the via address optional when the
RTA_MULTIPATH attribute is being parsed so that both forms of
specifying the route work.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:43:44 -05:00
Robert Shearman
eb7809f093 mpls: fix out-of-bounds access when via address not specified
When a via address isn't specified, the via table is left initialised
to 0 (NEIGH_ARP_TABLE), and the via address length also left
initialised to 0. This results in a via address array of length 0
being allocated (contiguous with route and nexthop array), meaning
that when a packet is sent using neigh_xmit the neighbour lookup and
creation will cause an out-of-bounds access when accessing the 4 bytes
of the IPv4 address it assumes it has been given a pointer to.

This could be fixed by allocating the 4 bytes of via address necessary
and leaving it as all zeroes. However, it seems wrong to me to use an
ipv4 nexthop (including possibly ARPing for 0.0.0.0) when the user
didn't specify to do so.

Instead, set the via address table to NEIGH_NR_TABLES to signify it
hasn't been specified and use this at forwarding time to signify a
neigh_xmit using an L2 address consisting of the device address. This
mechanism is the same as that used for both ARP and ND for loopback
interfaces and those flagged as no-arp, which are all we can really
support in this case.

Fixes: cf4b24f002 ("mpls: reduce memory usage of routes")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:43:44 -05:00
Robert Shearman
72dcac96c7 mpls: don't dump RTA_VIA attribute if not specified
The problem seen is that when adding a route with a nexthop with no
via address specified, iproute2 generates bogus output:

  # ip -f mpls route add 100 dev lo
  # ip -f mpls route list
  100 via inet 0.0.8.0 dev lo

The reason for this is that the kernel generates an RTA_VIA attribute
with the family set to AF_INET, but the via address data having zero
length. The cause of family being AF_INET is that on route insert
cfg->rc_via_table is left set to 0, which just happens to be
NEIGH_ARP_TABLE which is then translated into AF_INET.

iproute2 doesn't validate the length prior to printing and so prints
garbage. Although it could be fixed to do the validation, I would
argue that AF_INET addresses should always be exactly 4 bytes so the
kernel is really giving userspace bogus data.

Therefore, avoid generating the RTA_VIA attribute when dumping the
route if the via address wasn't specified on add/modify. This is
indicated by NEIGH_ARP_TABLE and a zero via address length - if the
user specified a via address the address length would have been
validated such that it was 4 bytes. Although this is a change in
behaviour that is visible to userspace, I believe that what was
generated before was invalid and as such userspace wouldn't be
expecting it.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:43:44 -05:00
Robert Shearman
a3e948e83a mpls: validate L2 via address length
If an L2 via address for an mpls nexthop is specified, the length of
the L2 address must match that expected by the output device,
otherwise it could access memory beyond the end of the via address
buffer in the route.

This check was present prior to commit f8efb73c97 ("mpls: multipath
route support"), but got lost in the refactoring, so add it back,
applying it to all nexthops in multipath routes.

Fixes: f8efb73c97 ("mpls: multipath route support")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12 00:43:44 -05:00
Joe Stringer
d110986c5d openvswitch: Respect conntrack zone even if invalid
If userspace executes ct(zone=1), and the connection tracker determines
that the packet is invalid, then the ct_zone flow key field is populated
with the default zone rather than the zone that was specified. Even
though connection tracking failed, this field should be updated with the
value that the action specified. Fix the issue.

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 23:31:31 -05:00
Joe Stringer
2f3ab9f9fc openvswitch: Fix helper reference leak
If the actions (re)allocation fails, or the actions list is larger than the
maximum size, and the conntrack action is the last action when these
problems are hit, then references to helper modules may be leaked. Fix
the issue.

Fixes: cae3a26275 ("openvswitch: Allow attaching helpers to ct action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 23:31:31 -05:00
Eric Dumazet
9470e24f35 ipv6: sctp: clone options to avoid use after free
SCTP is lacking proper np->opt cloning at accept() time.

TCP and DCCP use ipv6_dup_options() helper, do the same
in SCTP.

We might later factorize this code in a common helper to avoid
future mistakes.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 20:18:40 -05:00
Roopa Prabhu
6e71b29908 mpls_iptunnel: add static qualifier to mpls_output
This gets rid of the following compile warn:
net/mpls/mpls_iptunnel.c:40:5: warning: no previous prototype for
mpls_output [-Wmissing-prototypes]

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 20:17:43 -05:00
Eric Dumazet
d188ba86dd xfrm: add rcu protection to sk->sk_policy[]
XFRM can deal with SYNACK messages, sent while listener socket
is not locked. We add proper rcu protection to __xfrm_sk_clone_policy()
and xfrm_sk_policy_lookup()

This might serve as the first step to remove xfrm.xfrm_policy_lock
use in fast path.

Fixes: fa76ce7328 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 19:22:06 -05:00
Eric Dumazet
56f047305d xfrm: add rcu grace period in xfrm_policy_destroy()
We will soon switch sk->sk_policy[] to RCU protection,
as SYNACK packets are sent while listener socket is not locked.

This patch simply adds RCU grace period before struct xfrm_policy
freeing, and the corresponding rcu_head in struct xfrm_policy.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11 19:22:06 -05:00
Danny Schweizer
4ada1282d8 Bluetooth: Do not filter multicast addresses by default
A Linux PC is connected with another device over Bluetooth PAN using a
BNEP interface.

Whenever a packet is tried to be sent over the BNEP interface, the
function "bnep_net_xmit()" in "net/bluetooth/bnep/netdev.c" is called.
This function calls "bnep_net_mc_filter()", which checks (if the
destination address is multicast) if the address is set in a certain
multicast filter (&s->mc_filter). If it is not, then it is not sent out.

This filter is only changed in two other functions, found in
net/bluetooth/bnep/core.c": in "bnep_ctrl_set_mc_filter()", which is
only called if a message of type "BNEP_FILTER_MULTI_ADDR_SET" is
received. Otherwise, it is set in "bnep_add_connection()", where it is
set to a default value which only adds the broadcast address to the
filter:

set_bit(bnep_mc_hash(dev->broadcast), (ulong *) &s->mc_filter);

To sum up, if the BNEP interface does not receive any message of type
"BNEP_FILTER_MULTI_ADDR_SET", it will not send out any messages with
multicast destination addresses except for broadcast.

However, in the BNEP specification (page 27 in
http://grouper.ieee.org/groups/802/15/Bluetooth/BNEP.pdf), it is said
that per default, all multicast addresses should not be filtered, i.e.
the BNEP interface should be able to send packets with any multicast
destination address.

It seems that the default case is wrong: the multicast filter should not
block almost all multicast addresses, but should not filter out any.

This leads to the problem that e.g. Neighbor Solicitation messages sent
with Bluetooth PAN over the BNEP interface to a multicast destination
address other than broadcast are blocked and not sent out.

Therefore, in the default case, we set the mc_filter to ~0LL to not
filter out any multicast addresses.

Signed-off-by: Danny Schweizer <danny.schweizer@proofnet.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-11 10:46:16 +01:00
Alexander Aring
c38383530f mac802154: tx: fix synced xmit deadlock
This patch reverts 6001d52 ("mac802154: tx: don't allow if down while
sync tx"). This has side effects with stop callback which flush the
transmit workqueue. The stop callback will wait until the workqueue is
flushed and holding the rtnl lock. That means it can happen that the stop
callback waits forever because it try to lock the rtnl mutex which is
already hold by stop callback.

Cc: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 19:17:11 +01:00
Pablo Neira Ayuso
d3340b79ec netfilter: nf_dup: add missing dependencies with NF_CONNTRACK
CONFIG_NF_CONNTRACK=m
CONFIG_NF_DUP_IPV4=y

results in:

   net/built-in.o: In function `nf_dup_ipv4':
>> (.text+0xd434f): undefined reference to `nf_conntrack_untracked'

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-10 18:17:06 +01:00
Pablo Neira Ayuso
bd678e09dc netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones
If we attach the sk to the skb from nfnetlink_rcv_batch(), then
netlink_skb_destructor() will underflow the socket receive memory
counter and we get warning splat when releasing the socket.

$ cat /proc/net/netlink
sk       Eth Pid    Groups   Rmem     Wmem     Dump     Locks     Drops     Inode
ffff8800ca903000 12  0      00000000 -54144   0        0 2        0        17942
                                     ^^^^^^

Rmem above shows an underflow.

And here below the warning splat:

[ 1363.815976] WARNING: CPU: 2 PID: 1356 at net/netlink/af_netlink.c:958 netlink_sock_destruct+0x80/0xb9()
[...]
[ 1363.816152] CPU: 2 PID: 1356 Comm: kworker/u16:1 Tainted: G        W       4.4.0-rc1+ #153
[ 1363.816155] Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
[ 1363.816160] Workqueue: netns cleanup_net
[ 1363.816163]  0000000000000000 ffff880119203dd0 ffffffff81240204 0000000000000000
[ 1363.816169]  ffff880119203e08 ffffffff8104db4b ffffffff813d49a1 ffff8800ca771000
[ 1363.816174]  ffffffff81a42b00 0000000000000000 ffff8800c0afe1e0 ffff880119203e18
[ 1363.816179] Call Trace:
[ 1363.816181]  <IRQ>  [<ffffffff81240204>] dump_stack+0x4e/0x79
[ 1363.816193]  [<ffffffff8104db4b>] warn_slowpath_common+0x9a/0xb3
[ 1363.816197]  [<ffffffff813d49a1>] ? netlink_sock_destruct+0x80/0xb9

skb->sk was only needed to lookup for the netns, however we don't need
this anymore since 633c9a840d ("netfilter: nfnetlink: avoid recurrent
netns lookups in call_batch") so this patch removes this manual socket
assignment to resolve this problem.

Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-12-10 18:16:29 +01:00
Pablo Neira Ayuso
633c9a840d netfilter: nfnetlink: avoid recurrent netns lookups in call_batch
Pass the net pointer to the call_batch callback functions so we can skip
recurrent lookups.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-12-10 13:49:24 +01:00
Johannes Berg
b7bb110008 rfkill: copy the name into the rfkill struct
Some users of rfkill, like NFC and cfg80211, use a dynamic name when
allocating rfkill, in those cases dev_name(). Therefore, the pointer
passed to rfkill_alloc() might not be valid forever, I specifically
found the case that the rfkill name was quite obviously an invalid
pointer (or at least garbage) when the wiphy had been renamed.

Fix this by making a copy of the rfkill name in rfkill_alloc().

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-10 10:37:51 +01:00
Alexander Aring
b1815fd949 6lowpan: add debugfs support
This patch will introduce a 6lowpan entry into the debugfs if enabled.
Inside this 6lowpan directory we create a subdirectories of all 6lowpan
interfaces to offer a per interface debugfs support.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Alexander Aring
00f5931411 6lowpan: add lowpan dev register helpers
This patch introduces register and unregister functionality for lowpan
interfaces. While register a lowpan interface there are several things
which need to be initialize by the 6lowpan subsystem. Upcoming
functionality need to register/unregister per interface components e.g.
debugfs entry.

Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
43f26e17d0 6lowpan: add nhc module for GHC routing extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
2f4799478c 6lowpan: add nhc module for GHC fragmentation extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
20616a5a1e 6lowpan: add nhc module for GHC destination extension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:25 +01:00
Stefan Schmidt
c39da3bb5b 6lowpan: add nhc module for GHC ICMPv6 detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
70cc86752e 6lowpan: add nhc module for GHC UDP detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
7e568f50c1 6lowpan: add nhc module for GHC hop-by-hopextension header detection
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Stefan Schmidt
5e5c08cbee 6lowpan: clarify Kconfig entries for upcoming GHC support
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 01:25:24 +01:00
Yichen Zhao
1a11ec89db Bluetooth: Fix locking in bt_accept_dequeue after disconnection
Fix a crash that may happen when bt_accept_dequeue is run after a
Bluetooth connection has been disconnected. bt_accept_unlink was called
after release_sock, permitting bt_accept_unlink to run twice on the same
socket and cause a NULL pointer dereference.

[50510.241632] BUG: unable to handle kernel NULL pointer dereference at 00000000000001a8
[50510.241694] IP: [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.241759] PGD 0
[50510.241776] Oops: 0002 [#1] SMP
[50510.241802] Modules linked in: rtl8192cu rtl_usb rtlwifi rtl8192c_common 8021q garp stp mrp llc rfcomm bnep nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 ath9k ath9k_common ath9k_hw ath kvm eeepc_wmi asus_wmi mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek sparse_keymap crct10dif_pclmul snd_hda_codec_generic crc32_pclmul snd_hda_intel snd_hda_controller cfg80211 snd_hda_codec i915 snd_hwdep snd_pcm ghash_clmulni_intel snd_timer snd soundcore serio_raw cryptd drm_kms_helper drm i2c_algo_bit shpchp ath3k mei_me lpc_ich btusb bluetooth 6lowpan_iphc mei lp parport wmi video mac_hid psmouse ahci libahci r8169 mii
[50510.242279] CPU: 0 PID: 934 Comm: krfcommd Not tainted 3.16.0-49-generic #65~14.04.1-Ubuntu
[50510.242327] Hardware name: ASUSTeK Computer INC. VM40B/VM40B, BIOS 1501 12/09/2014
[50510.242370] task: ffff8800d9068a30 ti: ffff8800d7a54000 task.ti: ffff8800d7a54000
[50510.242413] RIP: 0010:[<ffffffffc01243f7>]  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.242480] RSP: 0018:ffff8800d7a57d58  EFLAGS: 00010246
[50510.242511] RAX: 0000000000000000 RBX: ffff880119bb8c00 RCX: ffff880119bb8eb0
[50510.242552] RDX: ffff880119bb8eb0 RSI: 00000000fffffe01 RDI: ffff880119bb8c00
[50510.242592] RBP: ffff8800d7a57d60 R08: 0000000000000283 R09: 0000000000000001
[50510.242633] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800d8da9eb0
[50510.242673] R13: ffff8800d74fdb80 R14: ffff880119bb8c00 R15: ffff8800d8da9c00
[50510.242715] FS:  0000000000000000(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
[50510.242761] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[50510.242794] CR2: 00000000000001a8 CR3: 0000000001c13000 CR4: 00000000001407f0
[50510.242835] Stack:
[50510.242849]  ffff880119bb8eb0 ffff8800d7a57da0 ffffffffc0124506 ffff8800d8da9eb0
[50510.242899]  ffff8800d8da9c00 ffff8800d9068a30 0000000000000000 ffff8800d74fdb80
[50510.242949]  ffff8800d6f85208 ffff8800d7a57e08 ffffffffc0159985 000000000000001f
[50510.242999] Call Trace:
[50510.243027]  [<ffffffffc0124506>] bt_accept_dequeue+0xb6/0x180 [bluetooth]
[50510.243085]  [<ffffffffc0159985>] l2cap_sock_accept+0x125/0x220 [bluetooth]
[50510.243128]  [<ffffffff810a1b30>] ? wake_up_state+0x20/0x20
[50510.243163]  [<ffffffff8164946e>] kernel_accept+0x4e/0xa0
[50510.243200]  [<ffffffffc05b97cd>] rfcomm_run+0x1ad/0x890 [rfcomm]
[50510.243238]  [<ffffffffc05b9620>] ? rfcomm_process_rx+0x8a0/0x8a0 [rfcomm]
[50510.243281]  [<ffffffff81091572>] kthread+0xd2/0xf0
[50510.243312]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243353]  [<ffffffff8176e9d8>] ret_from_fork+0x58/0x90
[50510.243387]  [<ffffffff810914a0>] ? kthread_create_on_node+0x1c0/0x1c0
[50510.243424] Code: 00 48 8b 93 b8 02 00 00 48 8d 83 b0 02 00 00 48 89 51 08 48 89 0a 48 89 83 b0 02 00 00 48 89 83 b8 02 00 00 48 8b 83 c0 02 00 00 <66> 83 a8 a8 01 00 00 01 48 c7 83 c0 02 00 00 00 00 00 00 f0 ff
[50510.243685] RIP  [<ffffffffc01243f7>] bt_accept_unlink+0x47/0xa0 [bluetooth]
[50510.243737]  RSP <ffff8800d7a57d58>
[50510.243758] CR2: 00000000000001a8
[50510.249457] ---[ end trace bb984f932c4e3ab3 ]---

Signed-off-by: Yichen Zhao <zhaoyichen@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:51 +01:00
Johan Hedberg
acb9f911ea Bluetooth: Don't treat connection timeout as a failure
When we're doing background scanning and connection attempts it's
possible we timeout trying to connect and go back to scanning again.
The timeout triggers a HCI_LE_Create_Connection_Cancel which will
trigger a Connection Complete with "Unknown Connection Identifier"
error status. Since we go back to scanning this isn't really a failure
and shouldn't be presented as such to user space through mgmt.

The exception to this is if the connection attempt was due to an
explicit request on an L2CAP socket (indicated by
params->explicit_connect being true). Since the socket will get an
error it's consistent to also notify the failure on mgmt in this case.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:51 +01:00
Johan Hedberg
2f99536a5b Bluetooth: Use continuous scanning when creating LE connections
All LE connections are now triggered through a preceding passive scan
and waiting for a connectable advertising report. This means we've got
the best possible guarantee that the device is within range and should
be able to request the controller to perform continuous scanning. This
way we minimize the risk that we miss out on any advertising packets.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.3+
2015-12-10 00:51:51 +01:00
Johan Hedberg
cab054ab47 Bluetooth: Clean up current advertising instance tracking
We can simplify a lot of code by making sure hdev->cur_adv_instance is
always up-to-date. This allows e.g. the removal of the
get_current_adv_instance() helper function and the special
HCI_ADV_CURRENT value. This patch also makes selecting instance 0x00
explicit in the various calls where advertising instances aren't
enabled, e.g. when HCI_ADVERTISING is set or we've just finished
enabling LE.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:50 +01:00
Johan Hedberg
d6b7e2cddb Bluetooth: Clean up advertising initialization in powered_update_hci()
The logic in powered_update_hci() to initialize the advertising data &
state is a bit more complicated than it needs to be. It was previously
not doing anything if HCI_LE_ENABLED wasn't set, but this was not
obvious by quickly looking at the code. Now the conditions for the
various actions are more explicit. Another simplification is due to
the fact that __hci_req_schedule_adv_instance() takes care of setting
hdev->cur_adv_instance so there's no need to set it before calling the
function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:50 +01:00
Johan Hedberg
550a8ca765 Bluetooth: Remove redundant check for req.cmd_q
The hci_req_run() function already checks for empty cmd_q and bails
out if necessary. Also, req.cmd_q should really be treated as private
data of the request and not accessed directly.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
d6dac32e84 Bluetooth: Fix updating wrong instance's scan_rsp data
The __hci_req_update_scan_rsp_data gets the instance to be updated
which should get passed to update_inst_scan_rsp_data() instead of
always enabling the current instance.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
17fd08ffb5 Bluetooth: Remove unnecessary HCI_ADVERTISING_INSTANCE flag
This flag just tells us whether hdev->adv_instances is empty or not.
We can equally well use the list_empty() function to get this
information.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
02c04afea9 Bluetooth: Simplify read_adv_features code
The code in the Read Advertising Features mgmt command handler is
unnecessarily complicated. Clean it up and remove unnecessary
variables & branches.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
2ff13894cf Bluetooth: Perform HCI update for power on synchronously
The request to update HCI during power on is always coming either from
hdev->req_workqueue or through an ioctl, so it's safe to use
hci_req_sync for it. This way we also eliminate potential races with
incoming mgmt commands or other actions while powering on.

Part of this refactoring is the splitting of mgmt_powered() into
mgmt_power_on() and __mgmt_power_off() functions. The main reason is
the different requirements as far as hdev locking is concerned, as
highlighted with the __ prefix of the power off API.

Since the power on in the case of clearing the AUTO_OFF flag cannot be
done synchronously in the set_powered mgmt handler, the hci_power_on
work callback is extended to cover this (which also simplifies the
set_powered helper a lot).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:49 +01:00
Johan Hedberg
bf943cbf76 Bluetooth: Move fast connectable code to hci_request.c
We'll soon need this both in hci_request.c and mgmt.c so move it to
hci_request.c as a generic helper.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
b1a8917c9b Bluetooth: Move EIR update to hci_request.c
We'll soon need to update the EIR both from hci_request.c and mgmt.c
so move update_eir() as a more generic request helper to
hci_request.c.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
00cf5040b3 Bluetooth: HCI name update to hci_request.c
We'll soon need this both from hci_request.c and mgmt.c so move it as
a request helper function to hci_request.c.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
c366f555b8 Bluetooth: Move discoverable timeout behind hdev->req_workqueue
Since the other discoverable changes are behind req_workqueue now it
only makes sense to move the discoverable timeout there as well.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
aed1a8851d Bluetooth: Move discoverable changes to hdev->req_workqueue
The discoverable mode is intrinsically linked with the connectable
mode e.g. through sharing the same HCI command (Write Scan Enable) for
BR/EDR. It makes therefore sense to move it to hci_request.c and run
the changes through the same hdev->req_workqueue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
14bf5eac7a Bluetooth: Perform Class of Device changes through hdev->req_workqueue
The Class of Device needs to be changed e.g. for limited discoverable
mode. In preparation of moving the discoverable mode to hci_request.c
and hdev->req_workqueue, move the Class of Device helpers there first.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
53c0ba7451 Bluetooth: Move connectable changes to hdev->req_workqueue
This way the connectable changes are synchronized against each other,
which helps avoid potential races. The connectable mode is also linked
together with LE advertising which makes is more convenient to have it
behind the same workqueue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:48 +01:00
Johan Hedberg
f22525700b Bluetooth: Move advertising instance management to hci_request.c
This paves the way for eventually performing advertising changes
through the hdev->req_workqueue. Some new APIs need to be exposed from
mgmt.c to hci_request.c and vice-versa, but many of them will go away
once hdev->req_workqueue gets used.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Johan Hedberg
196a5e97d1 Bluetooth: Move __hci_update_background_scan up in hci_request.c
This way we avoid the need to do a forward declaration in later
patches.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Johan Hedberg
01b1cb87d3 Bluetooth: Run page scan updates through hdev->req_workqueue
Since Add/Remove Device perform the page scan updates independently
from the HCI command completion we've introduced a potential race when
multiple mgmt commands are queued. Doing the page scan updates through
the req_workqueue ensures that the state changes are performed in a
race-free manner.

At the same time, to make the request helper more widely usable,
extend it to also cover Inquiry Scan changes since those are behind
the same HCI command. This is also reflected in the new name of the
API as well as the work struct name.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-12-10 00:51:47 +01:00
Florian Westphal
9fb0b519c7 netfilter: nf_tables: fix nf_log_trace based tracing
nf_log_trace() outputs bogus 'TRACE:' strings because I forgot to update
the comments array.

Fixes: 33d5a7b14b ("netfilter: nf_tables: extend tracing infrastructure")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 16:53:46 +01:00
Rosen, Rami
23509fcd4e netfilter: nfnetlink_log: Change setter functions to be void
Change return type of nfulnl_set_timeout() and nfulnl_set_qthresh() to
be void.

This patch changes the return type of the static methods
nfulnl_set_timeout() and nfulnl_set_qthresh() to be void, as there is no
justification and no need for these methods to return int.

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:52:56 +01:00
Nikolay Borisov
639e077b43 netfilter: nfnetlink_queue: Unregister pernet subsys in case of init failure
Commit 3bfe049807 ("netfilter: nfnetlink_{log,queue}:
Register pernet in first place") reorganised the initialisation
order of the pernet_subsys to avoid "use-before-initialised"
condition. However, in doing so the cleanup logic in nfnetlink_queue
got botched in that the pernet_subsys wasn't cleaned in case
nfnetlink_subsys_register failed. This patch adds the necessary
cleanup routine call.

Fixes: 3bfe049807 ("netfilter: nfnetlink_{log,queue}: Register pernet in first place")
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:46:47 +01:00
Florian Westphal
e97ac12859 netfilter: ipv6: nf_defrag: fix NULL deref panic
Valdis reports NULL deref in nf_ct_frag6_gather.
Problem is bogus use of skb_queue_walk() -- we miss first skb in the list
since we start with head->next instead of head.

In case the element we're looking for was head->next we won't find
a result and then trip over NULL iter.

(defrag uses plain NULL-terminated list rather than one terminated by
 head-of-list-pointer, which is what skb_queue_walk expects).

Fixes: 029f7f3b87 ("netfilter: ipv6: nf_defrag: avoid/free clone operations")
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 14:26:31 +01:00
Florian Westphal
e639f7ab07 netfilter: nf_tables: wrap tracing with a static key
Only needed when meta nftrace rule(s) were added.
The assumption is that no such rules are active, so the call to
nft_trace_init is "never" needed.

When nftrace rules are active, we always call the nft_trace_* functions,
but will only send netlink messages when all of the following are true:

 - traceinfo structure was initialised
 - skb->nf_trace == 1
 - at least one subscriber to trace group.

Adding an extra conditional
(static_branch ... && skb->nf_trace)
	nft_trace_init( ..)

Is possible but results in a larger nft_do_chain footprint.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 13:23:13 +01:00
Florian Westphal
33d5a7b14b netfilter: nf_tables: extend tracing infrastructure
nft monitor mode can then decode and display this trace data.

Parts of LL/Network/Transport headers are provided as separate
attributes.

Otherwise, printing IP address data becomes virtually impossible
for userspace since in the case of the netdev family we really don't
want userspace to have to know all the possible link layer types
and/or sizes just to display/print an ip address.

We also don't want userspace to have to follow ipv6 header chains
to get the s/dport info, the kernel already did this work for us.

To avoid bloating nft_do_chain all data required for tracing is
encapsulated in nft_traceinfo.

The structure is initialized unconditionally(!) for each nft_do_chain
invocation.

This unconditionall call will be moved under a static key in a
followup patch.

With lots of help from Patrick McHardy and Pablo Neira.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-12-09 13:18:37 +01:00
Tejun Heo
bd1060a1d6 sock, cgroup: add sock->sk_cgroup
In cgroup v1, dealing with cgroup membership was difficult because the
number of membership associations was unbound.  As a result, cgroup v1
grew several controllers whose primary purpose is either tagging
membership or pull in configuration knobs from other subsystems so
that cgroup membership test can be avoided.

net_cls and net_prio controllers are examples of the latter.  They
allow configuring network-specific attributes from cgroup side so that
network subsystem can avoid testing cgroup membership; unfortunately,
these are not only cumbersome but also problematic.

Both net_cls and net_prio aren't properly hierarchical.  Both inherit
configuration from the parent on creation but there's no interaction
afterwards.  An ancestor doesn't restrict the behavior in its subtree
in anyway and configuration changes aren't propagated downwards.
Especially when combined with cgroup delegation, this is problematic
because delegatees can mess up whatever network configuration
implemented at the system level.  net_prio would allow the delegatees
to set whatever priority value regardless of CAP_NET_ADMIN and net_cls
the same for classid.

While it is possible to solve these issues from controller side by
implementing hierarchical allowable ranges in both controllers, it
would involve quite a bit of complexity in the controllers and further
obfuscate network configuration as it becomes even more difficult to
tell what's actually being configured looking from the network side.
While not much can be done for v1 at this point, as membership
handling is sane on cgroup v2, it'd be better to make cgroup matching
behave like other network matches and classifiers than introducing
further complications.

In preparation, this patch updates sock->sk_cgrp_data handling so that
it points to the v2 cgroup that sock was created in until either
net_prio or net_cls is used.  Once either of the two is used,
sock->sk_cgrp_data reverts to its previous role of carrying prioidx
and classid.  This is to avoid adding yet another cgroup related field
to struct sock.

As the mode switching can happen at most once per boot, the switching
mechanism is aimed at lowering hot path overhead.  It may leak a
finite, likely small, number of cgroup refs and report spurious
prioidx or classid on switching; however, dynamic updates of prioidx
and classid have always been racy and lossy - socks between creation
and fd installation are never updated, config changes don't update
existing sockets at all, and prioidx may index with dead and recycled
cgroup IDs.  Non-critical inaccuracies from small race windows won't
make any noticeable difference.

This patch doesn't make use of the pointer yet.  The following patch
will implement netfilter match for cgroup2 membership.

v2: Use sock_cgroup_data to avoid inflating struct sock w/ another
    cgroup specific field.

v3: Add comments explaining why sock_data_prioidx() and
    sock_data_classid() use different fallback values.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Tejun Heo
2a56a1fec2 net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct
Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data.
->sk_cgroup_prioidx and ->sk_classid are moved into it.  The struct
and its accessors are defined in cgroup-defs.h.  This is to prepare
for overloading the fields with a cgroup pointer.

This patch mostly performs equivalent conversions but the followings
are noteworthy.

* Equality test before updating classid is removed from
  sock_update_classid().  This shouldn't make any noticeable
  difference and a similar test will be implemented on the helper side
  later.

* sock_update_netprioidx() now takes struct sock_cgroup_data and can
  be moved to netprio_cgroup.h without causing include dependency
  loop.  Moved.

* The dummy version of sock_update_netprioidx() converted to a static
  inline function while at it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Tejun Heo
297dbde19c netprio_cgroup: limit the maximum css->id to USHRT_MAX
netprio builds per-netdev contiguous priomap array which is indexed by
css->id.  The array is allocated using kzalloc() effectively limiting
the maximum ID supported to some thousand range.  This patch caps the
maximum supported css->id to USHRT_MAX which should be way above what
is actually useable.

This allows reducing sock->sk_cgrp_prioidx to u16 from u32.  The freed
up part will be used to overload the cgroup related fields.
sock->sk_cgrp_prioidx's position is swapped with sk_mark so that the
two cgroup related fields are adjacent.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 22:02:33 -05:00
Stefan Hajnoczi
8ac2837c89 Revert "Merge branch 'vsock-virtio'"
This reverts commit 0d76d6e8b2 and merge
commit c402293bd7, reversing changes made
to c89359a42e.

The virtio-vsock device specification is not finalized yet.  Michael
Tsirkin voiced concerned about merging this code when the hardware
interface (and possibly the userspace interface) could still change.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 21:55:49 -05:00
Rainer Weikusat
760a432247 net: Fix inverted test in __skb_recv_datagram
As the kernel generally uses negated error numbers, *err needs to be
compared with -EAGAIN (d'oh).

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Fixes: ea3793ee29 ("core: enable more fine-grained datagram reception control")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08 11:30:17 -05:00
Eric Dumazet
bd5eb35f16 xfrm: take care of request sockets
TCP SYNACK messages might now be attached to request sockets.

XFRM needs to get back to a listener socket.

Adds new helpers that might be used elsewhere :
sk_to_full_sk() and sk_const_to_full_sk()

Note: We also need to add RCU protection for xfrm lookups,
now TCP/DCCP have lockless listener processing. This will
be addressed in separate patches.

Fixes: ca6fb06518 ("tcp: attach SYNACK messages to request sockets instead of listener")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 17:07:33 -05:00
Eric Dumazet
69ce6487dc ipv6: sctp: fix lockdep splat in sctp_v6_get_dst()
While cooking the sctp np->opt rcu fixes, I forgot to move
one rcu_read_unlock() after the added rcu_dereference() in
sctp_v6_get_dst()

This gave lockdep warnings reported by Dave Jones.

Fixes: c836a8ba93 ("ipv6: sctp: add rcu protection around np->opt")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 17:07:33 -05:00
David S. Miller
0c9cd7c433 Included changes:
- prevent compatibility issue between DAT and speedy join from creating
   inconsistencies in the global translation table
 - make sure temporary TT entries are purged out if not claimed
 - fix comparison function used for TT hash table
 - fix invalid stack access in batadv_dat_select_candidates()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWZZqMAAoJENpFlCjNi1MRPg4P/Rq04JLU51tzIYnC7dmkcKjY
 L0qFTF4C4FjRzyeDH7d7GwfDJAu/AN/sCw/N0UzkBQ8qlGmtM5pkyTglr12w2axT
 V2LalU8nR+VIqL/DvCmMAPthwRsXfAXAkwiKEPCbKp9GHOKuVF9uIC+UQP0V0HQE
 6qKVii1j/IhVJWa30LrP8Uk/W4Ji55oB/oGBH4ggbWf0YWnaSldldfGGjtEY0JVd
 dfcOyTCYX3NiGs9QB5HRz2NDhc0t1u0CWTTaM1n5w7l2ejlUH6JYwqOfAr5Q5STt
 C265KonUeZCfq/ktzcEZ9c0NWHy8OePHA2anO1e5zPV/TwXNsSb50AnAHslaA7Fd
 vgZ3svTjhxowU20P21gMYOJGlUpmjMx+klBEcd0rAKEZHjcq2IFLS4LA6I8en8o4
 XtcJLj1Rrk/4XV9BntOlLVIfrA8TKH7B/vjAkbLa45WbH74B1V1sODmGxIv4mUVs
 Vhe5lTyPktKFhducU0g4jDIg+Up07JIrMKs9/9TXWXuXf4tH255yg42dYSCOoqQD
 4haKbUqQN90KvR1/cKGtkwpq0BlZq75RW4+SFtsUUZbdZyZNGx2zgR8g5e1owdN2
 1I4c2XAsTbJMEaB86hsNWv/goy0H57dn4SNwi34BJr6OADfcorr2qGd4LRrTDUaL
 hpwirVIN0SiaEz9krwDW
 =tH3+
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Antonio Quartulli says:

====================
Included changes:
- prevent compatibility issue between DAT and speedy join from creating
  inconsistencies in the global translation table
- make sure temporary TT entries are purged out if not claimed
- fix comparison function used for TT hash table
- fix invalid stack access in batadv_dat_select_candidates()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:59:19 -05:00
Neil Armstrong
cda5c15b23 net: dsa: move dsa slave destroy code to slave.c
Move dsa slave dedicated code from dsa_switch_destroy to a new
dsa_slave_destroy function in slave.c.
Add the netif_carrier_off and phy_disconnect calls in order to
correctly cleanup the netdev state and PHY state machine.

Signed-off-by: Frode Isaksen <fisaksen@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
679fb46c57 net: dsa: Add missing master netdev dev_put() calls
Upon probe failure or unbinding, add missing dev_put() calls.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
b0dc635d92 net: dsa: cleanup resources upon module removal
Make sure that we unassign the master_netdev dsa_ptr to make the packet
processing go through the regular Ethernet receive path.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:50 -05:00
Neil Armstrong
4baee937b8 net: dsa: remove DSA link polling
Since no more DSA driver uses the polling callback, and since
the phylib handles the link detection, remove the link polling
work and timer code.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:35:49 -05:00
Robert Shearman
fe82b3300e mpls: fix sending of local encapped packets
Locally generated IPv4 and (probably) IPv6 packets are dropped because
skb->protocol isn't set. We could write wrappers to lwtunnel_output
for IPv4 and IPv6 that set the protocol accordingly and then call
lwtunnel_output, but mpls_output relies on the AF-specific type of dst
anyway to get the via address.

Therefore, make use of dst->dst_ops->family in mpls_output to
determine the type of nexthop and thus protocol of the packet instead
of checking skb->protocol.

Fixes: 61adedf3e3 ("route: move lwtunnel state to dst_entry")
Reported-by: Sam Russell <sam.h.russell@gmail.com>
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 16:32:47 -05:00
Trond Myklebust
756b9b37cf SUNRPC: Fix callback channel
The NFSv4.1 callback channel is currently broken because the receive
message will keep shrinking because the backchannel receive buffer size
never gets reset.
The easiest solution to this problem is instead of changing the receive
buffer, to rather adjust the copied request.

Fixes: 38b7631fbe ("nfs4: limit callback decoding to received bytes")
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-07 13:04:59 -08:00
David S. Miller
ad9360b3e5 This pull request got a bit bigger than I wanted, due to
needing to reshuffle and fix some bugs. I merged mac80211
 to get the right base for some of these changes.
 
  * new mac80211 API for upcoming driver changes: EOSP handling,
    key iteration
  * scan abort changes allowing to cancel an ongoing scan
  * VHT IBSS 80+80 MHz support
  * re-enable full AP client state tracking after fixes
  * various small fixes (that weren't relevant for mac80211)
  * various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJWZVw7AAoJEGt7eEactAAdQcgP/1bOBBKgCHWZ8xhqmhLIPPUP
 AgkkyBcjCbSOWyE1axm5WQZM+fQvyGAcYsnhsK7h0Wy5Jvv6goNYhxkoD3L5lAKC
 LkiiqokTpLx1Em6Iugn1sdgag8q7EquYYQN+hOEOWtp32pTsx3/pDglCtGu0SX1N
 eystHEAu6mzPezat99M4s80fRlfBop3yaUuL5XopQFGtU37zfUgoXJB3BoXgxNjK
 XyD22jtPDreDMndZ9ugfvMaiq3iKRBhKXqgGb3SqMaStIyRK8zAkHb5jg3CllMeq
 bEsz4Rb4r+vtm2AVsUMWjfd/upQKwPwuvdvCvv4AQCO+aR9Rm+tR/wnnD4Gtnek5
 zPQ6XWt/0V4CKGl+W9shnDSA1DZ3hTijJlaGsK+RUqEtdq903lEP7fc2GsSvlund
 jXHfOExieuZOToKWTKpmNGsCw6fjJaGXNd/iLWo5VGAZS2X+JLmFZ94g43a6zOGZ
 s1Gz4F3tz4u4Bd26NAK2Z6CQRvDS4OOyLIjl9vpB9Fk/9nQx3f7WD8aBTRuCVAtG
 U2sFEUscz3rkdct30Gvkjm3ovmgc4pomTDvOpmNIsSCi2ygzGWHbEvSrrHdIjzVy
 KDcvRs6bRtCL/WxaaEIk46M6+6aKlSnZytPLl7vkNnvxXuEF7GYdnNVSUbSH9Nte
 XzT4+rZRiqyPZEGhBekw
 =+5dd
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
This pull request got a bit bigger than I wanted, due to
needing to reshuffle and fix some bugs. I merged mac80211
to get the right base for some of these changes.

 * new mac80211 API for upcoming driver changes: EOSP handling,
   key iteration
 * scan abort changes allowing to cancel an ongoing scan
 * VHT IBSS 80+80 MHz support
 * re-enable full AP client state tracking after fixes
 * various small fixes (that weren't relevant for mac80211)
 * various cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07 14:10:10 -05:00
Tejun Heo
0b98f0c042 Merge branch 'master' into for-4.4-fixes
The following commit which went into mainline through networking tree

  3b13758f51 ("cgroups: Allow dynamically changing net_classid")

conflicts in net/core/netclassid_cgroup.c with the following pending
fix in cgroup/for-4.4-fixes.

  1f7dd3e5a6 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")

The former separates out update_classid() from cgrp_attach() and
updates it to walk all fds of all tasks in the target css so that it
can be used from both migration and config change paths.  The latter
drops @css from cgrp_attach().

Resolve the conflict by making cgrp_attach() call update_classid()
with the css from the first task.  We can revive @tset walking in
cgrp_attach() but given that net_cls is v1 only where there always is
only one target css during migration, this is fine.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Nina Schiff <ninasc@fb.com>
2015-12-07 10:09:03 -05:00
Sven Eckelmann
b7fe3d4f4a batman-adv: Fix invalid stack access in batadv_dat_select_candidates
batadv_dat_select_candidates provides an u32 to batadv_hash_dat but it
needs a batadv_dat_entry with at least ip and vid filled in.

Fixes: 3e26722bc9f2 ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-07 22:40:21 +08:00
Marek Lindner
4c71895830 batman-adv: fix erroneous client entry duplicate detection
The translation table implementation, namely batadv_compare_tt(),
is used to compare two client entries and deciding if they are the
holding the same information. Each client entry is identified by
its mac address and its VLAN id (VID).
Consequently, batadv_compare_tt() has to not only compare the mac
addresses but also the VIDs.

Without this fix adding a new client entry that possesses the same
mac address as another client but operates on a different VID will
fail because both client entries will considered identical.

Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-07 22:40:21 +08:00
Simon Wunderlich
a6cb390940 batman-adv: avoid keeping false temporary entry
In the case when a temporary entry is added first and a proper tt entry
is added after that, the temporary tt entry is kept in the orig list.
However the temporary flag is removed at this point, and therefore the
purge function can not find this temporary entry anymore.

Therefore, remove the previous temp entry before adding the new proper
one.

This case can happen if a client behind a given originator moves before
the TT announcement is sent out. Other than that, this case can also be
created by bogus or malicious payload frames for VLANs which are not
existent on the sending originator.

Reported-by: Alessandro Bolletta <alessandro@mediaspot.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-07 21:58:14 +08:00
Simon Wunderlich
437bb0e645 batman-adv: fix speedy join for DAT cache replies
DAT Cache replies are answered on behalf of other clients which are not
connected to the answering originator. Therefore, we shouldn't add these
clients to the answering originators TT table through speed join to
avoid bogus entries.

Reported-by: Alessandro Bolletta <alessandro@mediaspot.net>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-12-07 21:58:14 +08:00
Ilan Peer
1b894521e6 mac80211: handle HW ROC expired properly
In case of HW ROC, when the driver reports that the ROC expired,
it is not sufficient to purge the ROCs based on the remaining
time, as it possible that the device finished the ROC session
before the actual requested duration.

To handle such cases, in case of ROC expired notification from
the driver, complete all the ROCs which are marked with hw_begun,
regardless of the remaining duration.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-07 11:06:37 +01:00
Rainer Weikusat
6487428088 af_unix: fix unix_dgram_recvmsg entry locking
The current unix_dgram_recvsmg code acquires the u->readlock mutex in
order to protect access to the peek offset prior to calling
__skb_recv_datagram for actually receiving data. This implies that a
blocking reader will go to sleep with this mutex held if there's
presently no data to return to userspace. Two non-desirable side effects
of this are that a later non-blocking read call on the same socket will
block on the ->readlock mutex until the earlier blocking call releases it
(or the readers is interrupted) and that later blocking read calls
will wait longer than the effective socket read timeout says they
should: The timeout will only start 'ticking' once such a reader hits
the schedule_timeout in wait_for_more_packets (core.c) while the time it
already had to wait until it could acquire the mutex is unaccounted for.

The patch avoids both by using the __skb_try_recv_datagram and
__skb_wait_for_more packets functions created by the first patch to
implement a unix_dgram_recvmsg read loop which releases the readlock
mutex prior to going to sleep and reacquires it as needed
afterwards. Non-blocking readers will thus immediately return with
-EAGAIN if there's no data available regardless of any concurrent
blocking readers and all blocking readers will end up sleeping via
schedule_timeout, thus honouring the configured socket receive timeout.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 23:31:54 -05:00
Rainer Weikusat
ea3793ee29 core: enable more fine-grained datagram reception control
The __skb_recv_datagram routine in core/ datagram.c provides a general
skb reception factility supposed to be utilized by protocol modules
providing datagram sockets. It encompasses both the actual recvmsg code
and a surrounding 'sleep until data is available' loop. This is
inconvenient if a protocol module has to use additional locking in order
to maintain some per-socket state the generic datagram socket code is
unaware of (as the af_unix code does). The patch below moves the recvmsg
proper code into a new __skb_try_recv_datagram routine which doesn't
sleep and renames wait_for_more_packets to
__skb_wait_for_more_packets, both routines being exported interfaces. The
original __skb_recv_datagram routine is reimplemented on top of these
two functions such that its user-visible behaviour remains unchanged.

Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 23:31:54 -05:00
lucien
8a0d19c5ed sctp: start t5 timer only when peer rwnd is 0 and local state is SHUTDOWN_PENDING
when A sends a data to B, then A close() and enter into SHUTDOWN_PENDING
state, if B neither claim his rwnd is 0 nor send SACK for this data, A
will keep retransmitting this data until t5 timeout, Max.Retrans times
can't work anymore, which is bad.

if B's rwnd is not 0, it should send abort after Max.Retrans times, only
when B's rwnd == 0 and A's retransmitting beyonds Max.Retrans times, A
will start t5 timer, which is also commit f8d9605243 ("sctp: Enforce
retransmission limit during shutdown") means, but it lacks the condition
peer rwnd == 0.

so fix it by adding a bit (zero_window_announced) in peer to record if
the last rwnd is 0. If it was, zero_window_announced will be set. and use
this bit to decide if start t5 timer when local.state is SHUTDOWN_PENDING.

Fixes: commit f8d9605243 ("sctp: Enforce retransmission limit during shutdown")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 22:31:51 -05:00
lucien
8b570dc9f7 sctp: only drop the reference on the datamsg after sending a msg
If the chunks are enqueued successfully but sctp_cmd_interpreter()
return err to sctp_sendmsg() (mainly because of no mem), the chunks will
get re-queued, but we are dropping the reference and freeing them.

The fix is to just drop the reference on the datamsg just as it had
succeeded, as:
 - if the chunks weren't queued, this is enough to get them freed.
 - if they were queued, they will get freed when they finally get out or
 discarded.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 13:25:12 -05:00
lucien
69b5777f2e sctp: hold the chunks only after the chunk is enqueued in outq
When a msg is sent, sctp will hold the chunks of this msg and then try
to enqueue them. But if the chunks are not enqueued in sctp_outq_tail()
because of the invalid state, sctp_cmd_interpreter() may still return
success to sctp_sendmsg() after calling sctp_outq_flush(), these chunks
will become orphans and will leak.

So we fix them by moving sctp_chunk_hold() to sctp_outq_tail(), where we
are sure that the chunk is going to get queued.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-06 13:25:12 -05:00
Marcelo Ricardo Leitner
50a5ffb1ef sctp: also copy sk_tsflags when copying the socket
As we are keeping timestamps on when copying the socket, we also have to
copy sk_tsflags.

This is needed since b9f40e21ef ("net-timestamp: move timestamp flags
out of sk_flags").

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 22:23:22 -05:00
Marcelo Ricardo Leitner
01ce63c901 sctp: update the netstamp_needed counter when copying sockets
Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.

When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.

The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 22:23:22 -05:00
Marcelo Ricardo Leitner
cb5e173ed7 sctp: use the same clock as if sock source timestamps were on
SCTP echoes a cookie o INIT ACK chunks that contains a timestamp, for
detecting stale cookies. This cookie is echoed back to the server by the
client and then that timestamp is checked.

Thing is, if the listening socket is using packet timestamping, the
cookie is encoded with ktime_get() value and checked against
ktime_get_real(), as done by __net_timestamp().

The fix is to sctp also use ktime_get_real(), so we can compare bananas
with bananas later no matter if packet timestamping was enabled or not.

Fixes: 52db882f3f ("net: sctp: migrate cookie life from timeval to ktime")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 22:23:22 -05:00
Jiri Pirko
b618aaa91b net: constify netif_is_* helpers net_device param
As suggested by Eric, these helpers should have const dev param.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 18:16:27 -05:00
Bjørn Mork
9a1ec4612c ipv6: keep existing flags when setting IFA_F_OPTIMISTIC
Commit 64236f3f3d ("ipv6: introduce IFA_F_STABLE_PRIVACY flag")
failed to update the setting of the IFA_F_OPTIMISTIC flag, causing
the IFA_F_STABLE_PRIVACY flag to be lost if IFA_F_OPTIMISTIC is set.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Fixes: 64236f3f3d ("ipv6: introduce IFA_F_STABLE_PRIVACY flag")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 18:09:46 -05:00
Andrew Lunn
a1a66b1100 batman-adv: Act on NETDEV_*_TYPE_CHANGE events
A network interface can change type. It may change from a type which
batman does not support, e.g. hdlc, to one it does, e.g. hdlc-eth.
When an interface changes type, it sends two notifications. Handle
these notifications.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 17:41:42 -05:00
Andrew Lunn
3ef0952ca8 ipv6: Only act upon NETDEV_*_TYPE_CHANGE if we have ipv6 addresses
An interface changing type may not have IPv6 addresses. Don't
call the address configuration type change in this case.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-05 17:41:42 -05:00
Nicolas Dichtel
6a61d4dbf4 gre6: allow to update all parameters via rtnl
Parameters were updated only if the kernel was unable to find the tunnel
with the new parameters, ie only if core pamareters were updated (keys,
addr, link, type).
Now it's possible to update ttl, hoplimit, flowinfo and flags.

Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04 16:52:06 -05:00
Johannes Berg
7d37fcd409 mac80211: reject zero cookie in mgmt-tx/roc cancel
When cancelling, you can cancel "any" (first in list) mgmt-tx
or remain-on-channel operation by using the value 0 for the
cookie along with the *opposite* operation, i.e.
 * cancel the first mgmt-tx by cancelling roc with 0 cookie
 * cancel the first roc by cancelling mgmt-tx with 0 cookie

This isn't really that bad since userspace should only pass
cookies that we gave it, but could lead to hard-to-debug
issues so better prevent it and reject zero values since we
never hand those out.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Jouni Malinen
c39b336deb mac80211: Allow a STA to join an IBSS with 80+80 MHz channel
While it was possible to create an IBSS with 80+80 MHz channel, joining
such an IBSS resulted in falling back to 20 MHz channel with VHT
disabled due to a missing switch case for 80+80.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Michal Sojka
1aeb135f84 cfg80211: reg: Refactor calculation of bandwidth flags
The same piece of code appears at two places. Make a function from it.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
aaa016ccd5 mac80211: rewrite remain-on-channel logic
Jouni found a bug in the remain-on-channel logic: when a short item
is queued, a long item is combined with it extending the original
one, and then the long item is deleted, the timeout doesn't go back
to the short one, and the short item ends up taking a long time. In
this case, this showed as blocking scan when running two test cases
back to back - the scan from the second was delayed even though all
the remain-on-channel items should long have been gone.

Fixing this with the current data structures turns out to be a bit
complicated, we just remove the long item from the dependents list
right now and don't recalculate the timeouts.

There's a somewhat similar bug where we delete the short item and
all the dependents go with it; to fix this we'd have to move them
from the dependents to the real list.

Instead of trying to do that, rewrite the code to not have all this
complexity in the data structures: use a single list and allow more
than one entry in it being marked as started. This makes the code a
bit more complex, the worker needs to understand that it might need
to just remove one of the started items, while keeping the device
off-channel, but that's not more complicated than the nested data
structures.

This then fixes both issues described, and makes it easier to also
limit the overall off-channel time when combining.

TODO: as before, with hardware remain-on-channel, deleting an item
after combining results in cancelling them all - we can keep track
of the time elapsed and only cancel after that to fix this.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
5ee00dbd52 mac80211: simplify ack_skb handling
Since the cookie is assigned inside ieee80211_make_ack_skb()
now, we no longer need to return the ack_skb as the cookie
and can simplify the function's return and the callers. Also
rename it to ieee80211_attach_ack_skb() to more accurately
reflect its purpose.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
a2fcfccbad mac80211: move off-channel/mgmt-tx code to offchannel.c
This is quite a bit of code that logically depends here since
it has to deal with all the remain-on-channel logic.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
e673a65952 mac80211: fix mgmt-tx abort cookie and leak
If a mgmt-tx operation is aborted before it runs, the wrong
cookie is reported back to userspace, and the ack_skb gets
leaked since the frame is freed directly instead of freeing
it using ieee80211_free_txskb(). Fix that.

Fixes: 3b79af973c ("mac80211: stop using pointers as userspace cookies")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
856142cdaa mac80211: catch queue stop underflow
If some code stops the queues more times than having started
(for when refcounting is used), warn on and reset the counter
to 0 to avoid blocking forever.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
63b4d8b373 mac80211: properly free TX skbs when monitor TX fails
We need to free all skbs here, not just the one we peeked
from the list.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
86c7ec9eb1 mac80211: properly free skb when r-o-c for TX fails
When freeing the TX skb for an off-channel TX, use the correct
API to also free the ACK skb that might have been allocated.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
90f9ba9b89 Revert "mac80211: don't advertise NL80211_FEATURE_FULL_AP_CLIENT_STATE"
This reverts commit 45bb780a21,
the previous two patches fixed the functionality.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
bda95eb1d1 cfg80211: handle add_station auth/assoc flag quirks
When a new station is added to AP/GO interfaces the default behaviour
is for it to be added authenticated and associated, due to backwards
compatibility. To prevent that, the driver must be able to do that
(setting the NL80211_FEATURE_FULL_AP_CLIENT_STATE feature flag) and
userspace must set the flag mask to auth|assoc and clear the set.

Handle this quirk in the API entirely in nl80211, and always push the
full flags to the drivers. NL80211_FEATURE_FULL_AP_CLIENT_STATE is
still required for userspace to be allowed to set the mask including
those bits, but after checking that add both flags to the mask and
set in case userspace didn't set them otherwise.

This obsoletes the mac80211 code handling this difference, no other
driver is currently using these flags.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Ayala Beker
a9bc31e418 cfg80211: use NL80211_ATTR_STA_AID in nl82011_set_station
Fix nl80211_set_station() to use the value of NL80211_ATTR_STA_AID
attribute instead of NL80211_ATTR_PEER_AID attribute.

Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Vidyullatha Kanchanapally
91f123f20d mac80211: Add support for aborting an ongoing scan
This commit adds implementation for abort scan in mac80211.

Reviewed-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Vidyullatha Kanchanapally <vkanchan@qti.qualcomm.com>
Signed-off-by: Sunil Dutt <usdutt@qti.qualcomm.com>
[adjust to wdev change in previous patch and clean up code a bit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Vidyullatha Kanchanapally
91d3ab4673 cfg80211: Add support for aborting an ongoing scan
Implement new functionality for aborting an ongoing scan.

Add NL80211_CMD_ABORT_SCAN to the nl80211 interface. After
aborting the scan, driver shall provide the scan status by
calling cfg80211_scan_done().

Reviewed-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Vidyullatha Kanchanapally <vkanchan@qti.qualcomm.com>
Signed-off-by: Sunil Dutt <usdutt@qti.qualcomm.com>
[change command to take wdev instead of netdev so that it
 can be used on p2p-device scans]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Janusz.Dziedzic@tieto.com
b115b97299 mac80211: add new IEEE80211_VIF_GET_NOA_UPDATE flag
Add new VIF flag, that will allow get NOA update
notification when driver will request this, even
this is not pure P2P vif (eg. STA vif).

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Michal Sojka
c781944b71 cfg80211: Remove unused cfg80211_can_use_iftype_chan()
Last caller of this function was removed in 3.17 in commit
97dc94f1d9.

Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Michal Sojka
491728746b cfg80211: reg: Remove unused function parameter
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Eliad Peller
ef044763a3 mac80211: add atomic uploaded keys iterator
add ieee80211_iter_keys_rcu() to iterate over uploaded
keys in atomic context (when rcu is locked)

The station removal code removes the keys only after
calling synchronize_net(), so it's not safe to iterate
the keys at this point (and postponing the actual key
deletion with call_rcu() might result in some
badly-ordered ops calls).

Add a flag to indicate a station is being removed,
and skip the configured keys if it's set.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Emmanuel Grumbach
0ead2510f8 mac80211: allow the driver to send EOSP when needed
This can happen when the driver needs to send less frames
than expected and then needs to close the SP.
Mac80211 still needs to set the more_data properly based
on its buffer state (ps_tx_buffer and buffered frames on
other TIDs).
To that end, refactor the code that delivers frames upon
uAPSD trigger frames to be able to get only the more_data
bit without actually delivering those frames in case the
driver is just asking to set a NDP with EOSP and MORE_DATA
bit properly set.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Ola Olsson
1b9df2d20e cfg80211: ocb: Fix null pointer deref if join_ocb is unimplemented
Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
441275e103 mac80211: remove string from unaligned packet warning
This really should never happen except very early in the process
of bringing up a new driver, at which point you'll have to add
more debugging in the driver and this string isn't useful. Remove
it and save some size (when it's even compiled in.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
996bf99c71 lib80211: ratelimit key index mismatch
This indicates a driver key selection issue, but even then there's
no point in printing it all the time, so ratelimit it. Also remove
the priv pointer from it -- people debugging will only have a single
device anyway and it's useless as anything but a cookie.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
d671b2a077 mac80211: mesh: print MAC address instead of pointer
There's no point in printing the mpath pointer since it can't
be used for anything - print the MAC address instead (like in
the forwarding case.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
0483eeac59 cfg80211: replace ieee80211_ie_split() with an inline
The function is a very simple wrapper around another one,
just adds a few default parameters, so replace it with a
static inline instead of using EXPORT_SYMBOL, reducing
the module size slightly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
6e045905d1 cfg80211: add complete data to station add/change tracing
Complete the tracepoint with the missing data - it's not printed
by default (a lot of it is dynamic arrays) but will be recorded
and be available during post-processing.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Ilan Peer
a1056b1baa cfg80211: Add missing tracing to cfg80211
Add missing tracing for:

1. start_radar_detection()
2. set_mcast_rates()
3. set_coalesce()

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
3110489117 mac80211: allow driver to prevent two stations w/ same address
Some devices or drivers cannot deal with having the same station
address for different virtual interfaces, say as a client to two
virtual AP interfaces. Rather than requiring each driver with a
limitation like that to enforce it, add a hardware flag for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:32 +01:00
Johannes Berg
d9d3ac7afd Merge remote-tracking branch 'mac80211/master' into HEAD
I want to get the full off-channel bugfix since later code depends on
it, as well as the AP client state change so I can revert it correctly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-04 14:43:05 +01:00
David S. Miller
f188b951f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/renesas/ravb_main.c
	kernel/bpf/syscall.c
	net/ipv4/ipmr.c

All three conflicts were cases of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 21:09:12 -05:00
Linus Torvalds
071f5d105a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A lot of Thanksgiving turkey leftovers accumulated, here goes:

   1) Fix bluetooth l2cap_chan object leak, from Johan Hedberg.

   2) IDs for some new iwlwifi chips, from Oren Givon.

   3) Fix rtlwifi lockups on boot, from Larry Finger.

   4) Fix memory leak in fm10k, from Stephen Hemminger.

   5) We have a route leak in the ipv6 tunnel infrastructure, fix from
      Paolo Abeni.

   6) Fix buffer pointer handling in arm64 bpf JIT,f rom Zi Shen Lim.

   7) Wrong lockdep annotations in tcp md5 support, fix from Eric
      Dumazet.

   8) Work around some middle boxes which prevent proper handling of TCP
      Fast Open, from Yuchung Cheng.

   9) TCP repair can do huge kmalloc() requests, build paged SKBs
      instead.  From Eric Dumazet.

  10) Fix msg_controllen overflow in scm_detach_fds, from Daniel
      Borkmann.

  11) Fix device leaks on ipmr table destruction in ipv4 and ipv6, from
      Nikolay Aleksandrov.

  12) Fix use after free in epoll with AF_UNIX sockets, from Rainer
      Weikusat.

  13) Fix double free in VRF code, from Nikolay Aleksandrov.

  14) Fix skb leaks on socket receive queue in tipc, from Ying Xue.

  15) Fix ifup/ifdown crach in xgene driver, from Iyappan Subramanian.

  16) Fix clearing of persistent array maps in bpf, from Daniel
      Borkmann.

  17) In TCP, for the cross-SYN case, we don't initialize tp->copied_seq
      early enough.  From Eric Dumazet.

  18) Fix out of bounds accesses in bpf array implementation when
      updating elements, from Daniel Borkmann.

  19) Fill gaps in RCU protection of np->opt in ipv6 stack, from Eric
      Dumazet.

  20) When dumping proxy neigh entries, we have to accomodate NULL
      device pointers properly, from Konstantin Khlebnikov.

  21) SCTP doesn't release all ipv6 socket resources properly, fix from
      Eric Dumazet.

  22) Prevent underflows of sch->q.qlen for multiqueue packet
      schedulers, also from Eric Dumazet.

  23) Fix MAC and unicast list handling in bnxt_en driver, from Jeffrey
      Huang and Michael Chan.

  24) Don't actively scan radar channels, from Antonio Quartulli"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (110 commits)
  net: phy: reset only targeted phy
  bnxt_en: Setup uc_list mac filters after resetting the chip.
  bnxt_en: enforce proper storing of MAC address
  bnxt_en: Fixed incorrect implementation of ndo_set_mac_address
  net: lpc_eth: remove irq > NR_IRQS check from probe()
  net_sched: fix qdisc_tree_decrease_qlen() races
  openvswitch: fix hangup on vxlan/gre/geneve device deletion
  ipv4: igmp: Allow removing groups from a removed interface
  ipv6: sctp: implement sctp_v6_destroy_sock()
  arm64: bpf: add 'store immediate' instruction
  ipv6: kill sk_dst_lock
  ipv6: sctp: add rcu protection around np->opt
  net/neighbour: fix crash at dumping device-agnostic proxy entries
  sctp: use GFP_USER for user-controlled kmalloc
  sctp: convert sack_needed and sack_generation to bits
  ipv6: add complete rcu protection around np->opt
  bpf: fix allocation warnings in bpf maps and integer overflow
  mvebu: dts: enable IP checksum with jumbo frames for Armada 38x on Port0
  net: mvneta: enable setting custom TX IP checksum limit
  net: mvneta: fix error path for building skb
  ...
2015-12-03 16:02:46 -08:00
David S. Miller
e3c9b1ef78 A small set of fixes for 4.4:
* fix scanning in mac80211 to not actively scan radar
    channels (from Antonio)
  * fix uninitialized variable in remain-on-channel that
    could lead to treating frame TX as remain-on-channel
    and not sending the frame at all
  * remove NL80211_FEATURE_FULL_AP_CLIENT_STATE again, it
    was broken and needs more work, we'll enable it later
  * fix call_rcu() induced use-after-reset/free in mesh
    (that was suddenly causing issues in certain tests)
  * always request block-ack window size 64 as we found
    some APs will otherwise crash (really ...)
  * fix P2P-Device teardown sequence to avoid restarting
    with uninitialized data
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJWX2JpAAoJEGt7eEactAAd+9cQAJmn3zt0orj/sASv7BeF0h5d
 sRfAhkBVOTZur8MgVj1c7fNzT3h1HYNei6c4SA2+rphy6Vbifoli1nLNloC+1Ld4
 2WXllEqVe473GqVofCxZHsYZPr2Inmhj7uMiDqvoKUiSRz7phmkY9m+Vju6WZG/W
 F6FrTLqFS7UDHIYNYH1DNVSScd/89Gu6pHZvEpoHkrsvt5rEEZiPAQ7sDB4MAMSm
 amETtuBqgX83gHR2G4UT2Z9r8TVdzhO+s7vvdVjj0qbP6C6BaS9IUXDjmm3gOvHy
 G7j9MJuUC8w/2fZ5A5/l94OuN5rF/ZFMNkn2e6OIzg0HjEZh74CeLl21CnuxdNpB
 ECmDVbKoI3OVoFbhEl7P5fBokzZsqhAXpZOmbYEeFRyO6lF2Mv9uzttsF6EOmCX0
 BjIoEXOWA2o6IUD/8M6NjW+/B58SDDVi9Mg6D+7Dn7rUFlQ4pddjb0m94bI8GQQU
 wl7gROMvYR3tIhiMs1bLF9jJgA831WGWu9eiq8mT2kHPaEV2bFO7OK+SUxyZu1M7
 UhN4eoLpU84v9QNJ34N8RCiYxEZ1e6HQxBwQn/fDIWOjOHryZoArhicFY9aOEja4
 9xBI9OJhBWOL4N4AFdmTExBdYudSgCTpX+/gQ4tSfedz3lqF79y8+PILwv6E1Q6D
 8pScH/4pVo4v5omGaMpA
 =vui7
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A small set of fixes for 4.4:
 * fix scanning in mac80211 to not actively scan radar
   channels (from Antonio)
 * fix uninitialized variable in remain-on-channel that
   could lead to treating frame TX as remain-on-channel
   and not sending the frame at all
 * remove NL80211_FEATURE_FULL_AP_CLIENT_STATE again, it
   was broken and needs more work, we'll enable it later
 * fix call_rcu() induced use-after-reset/free in mesh
   (that was suddenly causing issues in certain tests)
 * always request block-ack window size 64 as we found
   some APs will otherwise crash (really ...)
 * fix P2P-Device teardown sequence to avoid restarting
   with uninitialized data
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:56:22 -05:00
Jon Paul Maloy
dc8d1eb305 tipc: fix node reference count bug
Commit 5405ff6e15 ("tipc: convert node lock to rwlock")
introduced a bug to the node reference counter handling. When a
message is successfully sent in the function tipc_node_xmit(),
we return directly after releasing the node lock, instead of
continuing and decrementing the node reference counter as we
should do.

This commit fixes this bug.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:19:40 -05:00
Guillaume Nault
681b4d88ad pppox: use standard module auto-loading feature
* Register PF_PPPOX with pppox module rather than with pppoe,
    so that pppoe doesn't get loaded for any PF_PPPOX socket.

  * Register PX_PROTO_* with standard MODULE_ALIAS_NET_PF_PROTO()
    instead of using pppox's own naming scheme.

  * While there, add auto-loading feature for pptp.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:12:54 -05:00
Asias He
8a2a202989 VSOCK: Add Makefile and Kconfig
Enable virtio-vsock and vhost-vsock.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:05:55 -05:00
Asias He
32e61b06b6 VSOCK: Introduce virtio-vsock.ko
VM sockets virtio transport implementation. This module runs in guest
kernel.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:05:55 -05:00
Asias He
80a19e338d VSOCK: Introduce virtio-vsock-common.ko
This module contains the common code and header files for the following
virtio-vsock and virtio-vhost kernel modules.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:05:55 -05:00
Asias He
357ab2234d VSOCK: Introduce vsock_find_unbound_socket and vsock_bind_dgram_generic
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:05:54 -05:00
Roopa Prabhu
c89359a42e mpls: support for dead routes
Adds support for RTNH_F_DEAD and RTNH_F_LINKDOWN flags on mpls
routes due to link events. Also adds code to ignore dead
routes during route selection.

Unlike ip routes, mpls routes are not deleted when the route goes
dead. This is current mpls behaviour and this patch does not change
that. With this patch however, routes will be marked dead.
dead routes are not notified to userspace (this is consistent with ipv4
routes).

dead routes:
-----------
$ip -f mpls route show
100
    nexthop as to 200 via inet 10.1.1.2  dev swp1
    nexthop as to 700 via inet 10.1.1.6  dev swp2

$ip link set dev swp1 down

$ip link show dev swp1
4: swp1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode
DEFAULT group default qlen 1000
    link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff

$ip -f mpls route show
100
    nexthop as to 200 via inet 10.1.1.2  dev swp1 dead linkdown
    nexthop as to 700 via inet 10.1.1.6  dev swp2

linkdown routes:
----------------
$ip -f mpls route show
100
    nexthop as to 200 via inet 10.1.1.2  dev swp1
    nexthop as to 700 via inet 10.1.1.6  dev swp2

$ip link show dev swp1
4: swp1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 1000
    link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff

/* carrier goes down */
$ip link show dev swp1
4: swp1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:02:00:00:00:01 brd ff:ff:ff:ff:ff:ff

$ip -f mpls route show
100
    nexthop as to 200 via inet 10.1.1.2  dev swp1 linkdown
    nexthop as to 700 via inet 10.1.1.6  dev swp2

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:03:27 -05:00
Eric Dumazet
4eaf3b84f2 net_sched: fix qdisc_tree_decrease_qlen() races
qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
devices.

One problem is that it updates sch->q.qlen and sch->qstats.drops
on the mq/mqprio root qdisc, while it should not : Daniele
reported underflows errors :
[  681.774821] PAX: sch->q.qlen: 0 n: 1
[  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
[  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
[  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
[  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
[  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
[  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
[  681.774962] Call Trace:
[  681.774967]  [<ffffffffa95d2810>] dump_stack+0x4c/0x7f
[  681.774970]  [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50
[  681.774972]  [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160
[  681.774976]  [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
[  681.774978]  [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
[  681.774980]  [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0
[  681.774983]  [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160
[  681.774985]  [<ffffffffa90664c1>] __do_softirq+0xf1/0x200
[  681.774987]  [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30
[  681.774989]  [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260
[  681.774991]  [<ffffffffa9089560>] ? sort_range+0x40/0x40
[  681.774992]  [<ffffffffa9085fe4>] kthread+0xe4/0x100
[  681.774994]  [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170
[  681.774995]  [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70

mq/mqprio have their own ways to report qlen/drops by folding stats on
all their queues, with appropriate locking.

A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
without proper locking : concurrent qdisc updates could corrupt the list
that qdisc_match_from_root() parses to find a qdisc given its handle.

Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
qdisc_tree_decrease_qlen() can use to abort its tree traversal,
as soon as it meets a mq/mqprio qdisc children.

Second problem can be fixed by RCU protection.
Qdisc are already freed after RCU grace period, so qdisc_list_add() and
qdisc_list_del() simply have to use appropriate rcu list variants.

A future patch will add a per struct netdev_queue list anchor, so that
qdisc_tree_decrease_qlen() can have more efficient lookups.

Reported-by: Daniele Fucini <dfucini@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:59:05 -05:00
Phil Sutter
d6df198d92 net: ipv6: restrict hop_limit sysctl setting to range [1; 255]
Setting a value bigger than 255 resulted in using only the lower eight
bits of that value as it is assigned to the u8 header field. To avoid
this unexpected result, reject such values.

Setting a value of zero is technically possible, but hosts receiving
such a packet have to treat it like hop_limit was set to one, according
to RFC2460. Therefore I don't see a use-case for that.

Setting a route's hop_limit to zero in iproute2 means to use the sysctl
default, which is not the case here: Setting e.g.
net.conf.eth0.hop_limit=0 will not make the kernel use
net.conf.all.hop_limit for outgoing packets on eth0. To avoid these
kinds of confusion, reject zero.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:42:10 -05:00
Paolo Abeni
1317530302 openvswitch: fix hangup on vxlan/gre/geneve device deletion
Each openvswitch tunnel vport (vxlan,gre,geneve) holds a reference
to the underlying tunnel device, but never released it when such
device is deleted.
Deleting the underlying device via the ip tool cause the kernel to
hangup in the netdev_wait_allrefs() loop.
This commit ensure that on device unregistration dp_detach_port_notify()
is called for all vports that hold the device reference, properly
releasing it.

Fixes: 614732eaa1 ("openvswitch: Use regular VXLAN net_device device")
Fixes: b2acd1dc39 ("openvswitch: Use regular GRE net_device instead of vport")
Fixes: 6b001e682e ("openvswitch: Use Geneve device.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:29:25 -05:00
Andrew Lunn
4eba7bb1d7 ipv4: igmp: Allow removing groups from a removed interface
When a multicast group is joined on a socket, a struct ip_mc_socklist
is appended to the sockets mc_list containing information about the
joined group.

If the interface is hot unplugged, this entry becomes stale. Prior to
commit 52ad353a53 ("igmp: fix the problem when mc leave group") it
was possible to remove the stale entry by performing a
IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on
the interface. However, this fix enforces that the interface must
still exist. Thus with time, the number of stale entries grows, until
sysctl_igmp_max_memberships is reached and then it is not possible to
join and more groups.

The previous patch fixes an issue where a IP_DROP_MEMBERSHIP is
performed without specifying the interface, either by ifindex or ip
address. However here we do supply one of these. So loosen the
restriction on device existence to only apply when the interface has
not been specified. This then restores the ability to clean up the
stale entries.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 52ad353a53 "(igmp: fix the problem when mc leave group")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:07:05 -05:00
Eric Dumazet
602dd62dfb ipv6: sctp: implement sctp_v6_destroy_sock()
Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets.

We need to call inet6_destroy_sock() to properly release
inet6 specific fields.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:05:57 -05:00
David S. Miller
79aecc7216 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:

====================
pull request: bluetooth 2015-12-01

Here's a Bluetooth fix for the 4.4-rc series that fixes a memory leak of
the Security Manager L2CAP channel that'll happen for every LE
connection.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 12:04:05 -05:00
Jiri Pirko
04d482660a net: introduce change lower state notifier
When lower device like bonding slave, team/bridge port, etc changes its
state, it is useful for others to notice this change. Currently this is
implemented specificly for bonding as NETDEV_BONDING_INFO notifier. This
patch aims to replace this specific usage and make this more generic to
be used for all upper-lower devices.

Introduce NETDEV_CHANGELOWERSTATE netdev notifier type and
netdev_lower_state_changed() helper.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:26 -05:00
Jiri Pirko
29bf24afb2 net: add possibility to pass information about upper device via notifier
Sometimes the drivers and other code would find it handy to know some
internal information about upper device being changed. So allow upper-code
to pass information down to notifier listeners during linking.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:25 -05:00
Jiri Pirko
6dffb0447c net: propagate upper priv via netdev_master_upper_dev_link
Eliminate netdev_master_upper_dev_link_private and pass priv directly as
a parameter of netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:25 -05:00
Ido Schimmel
b03804e7c3 net: Check CHANGEUPPER notifier return value
switchdev drivers reflect the newly requested topology to hardware when
CHANGEUPPER is received, after software links were already formed.
However, the operation can fail and user will not be notified, as the
return value of the notifier is not checked.

Add this check and rollback software links if necessary.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:49:23 -05:00
Eric Dumazet
6bd4f355df ipv6: kill sk_dst_lock
While testing the np->opt RCU conversion, I found that UDP/IPv6 was
using a mixture of xchg() and sk_dst_lock to protect concurrent changes
to sk->sk_dst_cache, leading to possible corruptions and crashes.

ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
way to fix the mess is to remove sk_dst_lock completely, as we did for
IPv4.

__ip6_dst_store() and ip6_dst_store() share same implementation.

sk_setup_caps() being called with socket lock being held or not,
we have to use sk_dst_set() instead of __sk_dst_set()

Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);"
in ip6_dst_store() before the sk_setup_caps(sk, dst) call.

This is because ip6_dst_store() can be called from process context,
without any lock held.

As soon as the dst is installed in sk->sk_dst_cache, dst can be freed
from another cpu doing a concurrent ip6_dst_store()

Doing the dst dereference before doing the install is needed to make
sure no use after free would trigger.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:32:06 -05:00
Eric Dumazet
c836a8ba93 ipv6: sctp: add rcu protection around np->opt
This patch completes the work I did in commit 45f6fad84c
("ipv6: add complete rcu protection around np->opt"), as I missed
sctp part.

This simply makes sure np->opt is used with proper RCU locking
and accessors.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 11:30:58 -05:00
Tejun Heo
1f7dd3e5a6 cgroup: fix handling of multi-destination migration from subtree_control enabling
Consider the following v2 hierarchy.

  P0 (+memory) --- P1 (-memory) --- A
                                 \- B
       
P0 has memory enabled in its subtree_control while P1 doesn't.  If
both A and B contain processes, they would belong to the memory css of
P1.  Now if memory is enabled on P1's subtree_control, memory csses
should be created on both A and B and A's processes should be moved to
the former and B's processes the latter.  IOW, enabling controllers
can cause atomic migrations into different csses.

The core cgroup migration logic has been updated accordingly but the
controller migration methods haven't and still assume that all tasks
migrate to a single target css; furthermore, the methods were fed the
css in which subtree_control was updated which is the parent of the
target csses.  pids controller depends on the migration methods to
move charges and this made the controller attribute charges to the
wrong csses often triggering the following warning by driving a
counter negative.

 WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
 Modules linked in:
 CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #29
 ...
  ffffffff81f65382 ffff88007c043b90 ffffffff81551ffc 0000000000000000
  ffff88007c043bc8 ffffffff810de202 ffff88007a752000 ffff88007a29ab00
  ffff88007c043c80 ffff88007a1d8400 0000000000000001 ffff88007c043bd8
 Call Trace:
  [<ffffffff81551ffc>] dump_stack+0x4e/0x82
  [<ffffffff810de202>] warn_slowpath_common+0x82/0xc0
  [<ffffffff810de2fa>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8118e031>] pids_cancel.constprop.6+0x31/0x40
  [<ffffffff8118e0fd>] pids_can_attach+0x6d/0xf0
  [<ffffffff81188a4c>] cgroup_taskset_migrate+0x6c/0x330
  [<ffffffff81188e05>] cgroup_migrate+0xf5/0x190
  [<ffffffff81189016>] cgroup_attach_task+0x176/0x200
  [<ffffffff8118949d>] __cgroup_procs_write+0x2ad/0x460
  [<ffffffff81189684>] cgroup_procs_write+0x14/0x20
  [<ffffffff811854e5>] cgroup_file_write+0x35/0x1c0
  [<ffffffff812e26f1>] kernfs_fop_write+0x141/0x190
  [<ffffffff81265f88>] __vfs_write+0x28/0xe0
  [<ffffffff812666fc>] vfs_write+0xac/0x1a0
  [<ffffffff81267019>] SyS_write+0x49/0xb0
  [<ffffffff81bcef32>] entry_SYSCALL_64_fastpath+0x12/0x76

This patch fixes the bug by removing @css parameter from the three
migration methods, ->can_attach, ->cancel_attach() and ->attach() and
updating cgroup_taskset iteration helpers also return the destination
css in addition to the task being migrated.  All controllers are
updated accordingly.

* Controllers which don't care whether there are one or multiple
  target csses can be converted trivially.  cpu, io, freezer, perf,
  netclassid and netprio fall in this category.

* cpuset's current implementation assumes that there's single source
  and destination and thus doesn't support v2 hierarchy already.  The
  only change made by this patchset is how that single destination css
  is obtained.

* memory migration path already doesn't do anything on v2.  How the
  single destination css is obtained is updated and the prep stage of
  mem_cgroup_can_attach() is reordered to accomodate the change.

* pids is the only controller which was affected by this bug.  It now
  correctly handles multi-destination migrations and no longer causes
  counter underflow from incorrect accounting.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Aleksa Sarai <cyphar@cyphar.com>
2015-12-03 10:18:21 -05:00
Konstantin Khlebnikov
6adc5fd6a1 net/neighbour: fix crash at dumping device-agnostic proxy entries
Proxy entries could have null pointer to net-device.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 84920c1420 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 00:07:51 -05:00
Eric Dumazet
7450aaf61f tcp: suppress too verbose messages in tcp_send_ack()
If tcp_send_ack() can not allocate skb, we properly handle this
and setup a timer to try later.

Use __GFP_NOWARN to avoid polluting syslog in the case host is
under memory pressure, so that pertinent messages are not lost under
a flood of useless information.

sk_gfp_atomic() can use its gfp_mask argument (all callers currently
were using GFP_ATOMIC before this patch)

We rename sk_gfp_atomic() to sk_gfp_mask() to clearly express this
function now takes into account its second argument (gfp_mask)

Note that when tcp_transmit_skb() is called with clone_it set to false,
we do not attempt memory allocations, so can pass a 0 gfp_mask, which
most compilers can emit faster than a non zero or constant value.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:44:32 -05:00
Marcelo Ricardo Leitner
cacc062152 sctp: use GFP_USER for user-controlled kmalloc
Dmitry Vyukov reported that the user could trigger a kernel warning by
using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that
value directly affects the value used as a kmalloc() parameter.

This patch thus switches the allocation flags from all user-controllable
kmalloc size to GFP_USER to put some more restrictions on it and also
disables the warn, as they are not necessary.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:39:46 -05:00
Eric Dumazet
45f6fad84c ipv6: add complete rcu protection around np->opt
This patch addresses multiple problems :

UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
while socket is not locked : Other threads can change np->opt
concurrently. Dmitry posted a syzkaller
(http://github.com/google/syzkaller) program desmonstrating
use-after-free.

Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
and dccp_v6_request_recv_sock() also need to use RCU protection
to dereference np->opt once (before calling ipv6_dup_options())

This patch adds full RCU protection to np->opt

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 23:37:16 -05:00
Johannes Berg
c1df932c05 mac80211: fix off-channel mgmt-tx uninitialized variable usage
In the last change here, I neglected to update the cookie in one code
path: when a mgmt-tx has no real cookie sent to userspace as it doesn't
wait for a response, but is off-channel. The original code used the SKB
pointer as the cookie and always assigned the cookie to the TX SKB in
ieee80211_start_roc_work(), but my change turned this around and made
the code rely on a valid cookie being passed in.

Unfortunately, the off-channel no-wait TX path wasn't assigning one at
all, resulting in an uninitialized stack value being used. This wasn't
handed back to userspace as a cookie (since in the no-wait case there
isn't a cookie), but it was tested for non-zero to distinguish between
mgmt-tx and off-channel.

Fix this by assigning a dummy non-zero cookie unconditionally, and get
rid of a misleading comment and some dead code while at it. I'll clean
up the ACK SKB handling separately later.

Fixes: 3b79af973c ("mac80211: stop using pointers as userspace cookies")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02 22:27:53 +01:00
Antonio Quartulli
4e39ccac0d mac80211: do not actively scan DFS channels
DFS channels should not be actively scanned as we can't be sure
if we are allowed or not.

If the current channel is in the DFS band, active scan might be
performed after CSA, but we have no guarantee about other channels,
therefore it is safer to prevent active scanning at all.

Cc: stable@vger.kernel.org
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02 22:27:53 +01:00
Eliad Peller
835112b289 mac80211: don't teardown sdata on sdata stop
Interfaces are being initialized (setup) on addition,
and torn down on removal.

However, p2p device is being torn down when stopped,
resulting in the next p2p start operation being done
on uninitialized interface.

Solve it by calling ieee80211_teardown_sdata() only
on interface removal (for the non-netdev case).

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
[squashed in fix to call teardown after unregister]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02 22:27:27 +01:00
Paolo Abeni
83e4bf7a74 openvswitch: properly refcount vport-vxlan module
After 614732eaa1, no refcount is maintained for the vport-vxlan module.
This allows the userspace to remove such module while vport-vxlan
devices still exist, which leads to later oops.

v1 -> v2:
 - move vport 'owner' initialization in ovs_vport_ops_register()
   and make such function a macro

Fixes: 614732eaa1 ("openvswitch: Use regular VXLAN net_device device")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02 11:50:59 -05:00
Eric Dumazet
ceb5d58b21 net: fix sock_wake_async() rcu protection
Dmitry provided a syzkaller (http://github.com/google/syzkaller)
triggering a fault in sock_wake_async() when async IO is requested.

Said program stressed af_unix sockets, but the issue is generic
and should be addressed in core networking stack.

The problem is that by the time sock_wake_async() is called,
we should not access the @flags field of 'struct socket',
as the inode containing this socket might be freed without
further notice, and without RCU grace period.

We already maintain an RCU protected structure, "struct socket_wq"
so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
is the safe route.

It also reduces number of cache lines needing dirtying, so might
provide a performance improvement anyway.

In followup patches, we might move remaining flags (SOCK_NOSPACE,
SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
being mostly read and let it being shared between cpus.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00
Eric Dumazet
9cd3e072b0 net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
This patch is a cleanup to make following patch easier to
review.

Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
from (struct socket)->flags to a (struct socket_wq)->flags
to benefit from RCU protection in sock_wake_async()

To ease backports, we rename both constants.

Two new helpers, sk_set_bit(int nr, struct sock *sk)
and sk_clear_bit(int net, struct sock *sk) are added so that
following patch can change their implementation.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-01 15:45:05 -05:00