Commit graph

632829 commits

Author SHA1 Message Date
Eric Dumazet
82454581d7 tcp: do not export sysctl_tcp_low_latency
Since commit b2fb4f54ec ("tcp: uninline tcp_prequeue()") we no longer
access sysctl_tcp_low_latency from a module.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 11:12:41 -04:00
Jiri Pirko
85dda4e5b0 rtnetlink: Add rtnexthop offload flag to compare mask
The offload flag is a status flag and should not be used by
FIB semantics for comparison.

Fixes: 37ed949369 ("rtnetlink: add RTNH_F_EXTERNAL flag for fib offload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 11:07:57 -04:00
Ido Schimmel
97c242902c switchdev: Execute bridge ndos only for bridge ports
We recently got the following warning after setting up a vlan device on
top of an offloaded bridge and executing 'bridge link':

WARNING: CPU: 0 PID: 18566 at drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:81 mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
[...]
 CPU: 0 PID: 18566 Comm: bridge Not tainted 4.8.0-rc7 #1
 Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
  0000000000000286 00000000e64ab94f ffff880406e6f8f0 ffffffff8135eaa3
  0000000000000000 0000000000000000 ffff880406e6f930 ffffffff8108c43b
  0000005106e6f988 ffff8803df398840 ffff880403c60108 ffff880406e6f990
 Call Trace:
  [<ffffffff8135eaa3>] dump_stack+0x63/0x90
  [<ffffffff8108c43b>] __warn+0xcb/0xf0
  [<ffffffff8108c56d>] warn_slowpath_null+0x1d/0x20
  [<ffffffffa01420d5>] mlxsw_sp_port_orig_get.part.9+0x55/0x70 [mlxsw_spectrum]
  [<ffffffffa0142195>] mlxsw_sp_port_attr_get+0xa5/0xb0 [mlxsw_spectrum]
  [<ffffffff816f151f>] switchdev_port_attr_get+0x4f/0x140
  [<ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
  [<ffffffff816f15d0>] switchdev_port_attr_get+0x100/0x140
  [<ffffffff816f1d6b>] switchdev_port_bridge_getlink+0x5b/0xc0
  [<ffffffff816f2680>] ? switchdev_port_fdb_dump+0x90/0x90
  [<ffffffff815f5427>] rtnl_bridge_getlink+0xe7/0x190
  [<ffffffff8161a1b2>] netlink_dump+0x122/0x290
  [<ffffffff8161b0df>] __netlink_dump_start+0x15f/0x190
  [<ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
  [<ffffffff815fab46>] rtnetlink_rcv_msg+0x1a6/0x220
  [<ffffffff81208118>] ? __kmalloc_node_track_caller+0x208/0x2c0
  [<ffffffff815f5340>] ? rtnl_bridge_dellink+0x230/0x230
  [<ffffffff815fa9a0>] ? rtnl_newlink+0x890/0x890
  [<ffffffff8161cf54>] netlink_rcv_skb+0xa4/0xc0
  [<ffffffff815f56f8>] rtnetlink_rcv+0x28/0x30
  [<ffffffff8161c92c>] netlink_unicast+0x18c/0x240
  [<ffffffff8161ccdb>] netlink_sendmsg+0x2fb/0x3a0
  [<ffffffff815c5a48>] sock_sendmsg+0x38/0x50
  [<ffffffff815c6031>] SYSC_sendto+0x101/0x190
  [<ffffffff815c7111>] ? __sys_recvmsg+0x51/0x90
  [<ffffffff815c6b6e>] SyS_sendto+0xe/0x10
  [<ffffffff817017f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4

The problem is that the 8021q module propagates the call to
ndo_bridge_getlink() via switchdev ops, but the switch driver doesn't
recognize the netdev, as it's not offloaded.

While we can ignore calls being made to non-bridge ports inside the
driver, a better fix would be to push this check up to the switchdev
layer.

Note that these ndos can be called for non-bridged netdev, but this only
happens in certain PF drivers which don't call the corresponding
switchdev functions anyway.

Fixes: 99f44bb352 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Tested-by: Tamir Winetroub <tamirw@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 10:58:04 -04:00
Ido Schimmel
e4961b0768 net: core: Correctly iterate over lower adjacency list
Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:

 NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
 CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.0-rc7 #1
 Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
 task: ffff88017edfea40 task.stack: ffff88017ee10000
 RIP: 0010:[<ffffffff815dcc73>]  [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
 Call Trace:
  <IRQ>
  [<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
  [<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
  [<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
  [<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
  [<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
  [<ffffffff815f2eb6>] neigh_update+0x306/0x740
  [<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
  [<ffffffff8165ea3f>] arp_process+0x66f/0x700
  [<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
  [<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
  [<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
  [<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
  [<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
  [<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
  [<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
  [<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
  [<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
  [<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
  [<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
  [<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
  [<ffffffff81374815>] ? find_next_bit+0x15/0x20
  [<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
  [<ffffffff810c9968>] ? load_balance+0x178/0x9b0
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
  [<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
  [<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
  [<ffffffff81091986>] tasklet_action+0xf6/0x110
  [<ffffffff81704556>] __do_softirq+0xf6/0x280
  [<ffffffff8109213f>] irq_exit+0xdf/0xf0
  [<ffffffff817042b4>] do_IRQ+0x54/0xd0
  [<ffffffff8170214c>] common_interrupt+0x8c/0x8c

The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.

Fix this by advancing the iterator and avoid the infinite loop.

Fixes: 7ce856aaaf ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 10:38:08 -04:00
Eric Garver
3805a938a6 flow_dissector: Check skb for VLAN only if skb specified.
Fixes a panic when calling eth_get_headlen(). Noticed on i40e driver.

Fixes: d5709f7ab7 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
Signed-off-by: Eric Garver <e@erig.me>
Reviewed-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 10:35:46 -04:00
Wei Yongjun
b4f0fd4baa qed: Use list_move_tail instead of list_del/list_add_tail
Using list_move_tail() instead of list_del() + list_add_tail().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 16:40:41 -04:00
Arnd Bergmann
ecf244f753 rocker: fix maybe-uninitialized warning
In some rare configurations, we get a warning about the 'index' variable
being used without an initialization:

drivers/net/ethernet/rocker/rocker_ofdpa.c: In function ‘ofdpa_port_fib_ipv4.isra.16.constprop’:
drivers/net/ethernet/rocker/rocker_ofdpa.c:2425:92: warning: ‘index’ may be used uninitialized in this function [-Wmaybe-uninitialized]

This is a false positive, the logic is just a bit too complex for gcc
to follow here. Moving the intialization of 'index' a little further
down makes it clear to gcc that the function always returns an error
if it is not initialized.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:20:36 -04:00
Arnd Bergmann
52ccd63184 net/hyperv: avoid uninitialized variable
The hdr_offset variable is only if we deal with a TCP or UDP packet,
but as the check surrounding its usage tests for skb_is_gso()
instead, the compiler has no idea if the variable is initialized
or not at that point:

drivers/net/hyperv/netvsc_drv.c: In function ‘netvsc_start_xmit’:
drivers/net/hyperv/netvsc_drv.c:494:42: error: ‘hdr_offset’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

This adds an additional check for the transport type, which
tells the compiler that this path cannot happen. Since the
get_net_transport_info() function should always be inlined
here, I don't expect this to result in additional runtime
checks.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:20:36 -04:00
Arnd Bergmann
4b75ca5a7a net: bcm63xx: avoid referencing uninitialized variable
gcc found a reference to an uninitialized variable in the error handling
of bcm_enet_open, introduced by a recent cleanup:

drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open'
drivers/net/ethernet/broadcom/bcm63xx_enet.c:1129:2: warning: 'phydev' may be used uninitialized in this function [-Wmaybe-uninitialized]

This makes the use of that variable conditional, so we only reference it
here after it has been used before. Unlike my normal patches, I have not
build-tested this one, as I don't currently have mips test in my
randconfig setup.

Fixes: 625eb8667d ("net: ethernet: broadcom: bcm63xx: use phydev from struct net_device")
Cc: Philippe Reynes <tremyfr@gmail.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:20:36 -04:00
Eric Dumazet
41ee9c557e soreuseport: do not export reuseport_add_sock()
reuseport_add_sock() is not used from a module,
no need to export it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:18:23 -04:00
Thomas Falcon
87737f8810 ibmvnic: Update MTU after device initialization
It is possible for the MTU to be changed during the initialization
process with the VNIC Server.  Ensure that the net device is updated
to reflect the new MTU.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:17:13 -04:00
Thomas Falcon
12608c260d ibmvnic: Fix GFP_KERNEL allocation in interrupt context
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:17:13 -04:00
Thomas Falcon
9fa2f2cca2 ibmvnic: Driver Version 1.0.1
Increment driver version to reflect features that have
been added since release.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 14:17:13 -04:00
Nikolay Aleksandrov
7cb3f9214d bridge: multicast: restore perm router ports on multicast enable
Satish reported a problem with the perm multicast router ports not getting
reenabled after some series of events, in particular if it happens that the
multicast snooping has been disabled and the port goes to disabled state
then it will be deleted from the router port list, but if it moves into
non-disabled state it will not be re-added because the mcast snooping is
still disabled, and enabling snooping later does nothing.

Here are the steps to reproduce, setup br0 with snooping enabled and eth1
added as a perm router (multicast_router = 2):
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
<empty>

At this point we have mcast enabled and eth1 as a perm router (value = 2)
but it is not in the router list which is incorrect.

After this change:
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
router ports on br0: eth1

Note: we can directly do br_multicast_enable_port for all because the
querier timer already has checks for the port state and will simply
expire if it's in blocking/disabled. See the comment added by
commit 9aa6638216 ("bridge: multicast: add a comment to
br_port_state_selection about blocking state")

Fixes: 561f1103a2 ("bridge: Add multicast_snooping sysfs toggle")
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 13:52:13 -04:00
Tobias Klauser
7ab488951a tcp: Remove unused but set variable
Remove the unused but set variable icsk in listening_get_next to fix the
following GCC warning when building with 'W=1':

  net/ipv4/tcp_ipv4.c: In function ‘listening_get_next’:
  net/ipv4/tcp_ipv4.c:1890:31: warning: variable ‘icsk’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:34:27 -04:00
Ganesh Goudar
a56177e18f cxgb4: Fix number of queue sets corssing the limit
Do not let number of offload queue sets to go more than
MAX_OFLD_QSETS, which would otherwise crash the driver
on machines with cores more than MAX_OFLD_QSETS.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:32:39 -04:00
Tobias Klauser
7e1670c15c ipv4: Remove unused but set variable
Remove the unused but set variable dev in ip_do_fragment to fix the
following GCC warning when building with 'W=1':

  net/ipv4/ip_output.c: In function ‘ip_do_fragment’:
  net/ipv4/ip_output.c:541:21: warning: variable ‘dev’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:30:28 -04:00
Niklas Cassel
1826277802 dwc_eth_qos: enable flow control by default
Allow autoneg to enable flow control by default.
The behavior when autoneg is off has not changed.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:29:17 -04:00
Niklas Cassel
902943c0d7 dwc_eth_qos: do not clear pause flags from phy_device->supported
phy_device->supported is originally set by the PHY driver.
The ethernet driver should filter phy_device->supported to only contain
flags supported by the IP.
The IP supports setting rx and tx flow control independently,
therefore SUPPORTED_Pause and SUPPORTED_Asym_Pause should not be cleared.
If the flags are cleared, pause frames cannot be enabled (even if they
are supported by the PHY).

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:29:17 -04:00
Tobias Klauser
8d324fd9e1 net/hsr: Remove unused but set variable
Remove the unused but set variable master_dev in check_local_dest to fix
the following GCC warning when building with 'W=1':

  net/hsr/hsr_forward.c: In function ‘check_local_dest’:
  net/hsr/hsr_forward.c:303:21: warning: variable ‘master_dev’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:28:18 -04:00
David S. Miller
5cbee55736 This is relatively small, mostly to get the SG/crypto
from stack removal fix that crashes things when VMAP
 stack is used in conjunction with software crypto.
 
 Aside from that, we have:
  * a fix for AP_VLAN usage with the nl80211 frame command
  * two fixes (and two preparation patches) for A-MSDU, one
    to discard group-addressed (multicast) and unexpected
    4-address A-MSDUs, the other to validate A-MSDU inner
    MAC addresses properly to prevent controlled port bypass
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJYBcgKAAoJEGt7eEactAAdhUkP/jMVQbLMZ1Jcc9+lsPVGUIga
 I9GeQ4lcnD+4ASeJUhTtemC1IMNL4zMVqaIxbznDXKP7rZRrODVvCPk2TYIw9c5S
 rzF/TRierMFttLu3xY757nAsYg6T7F03JdOQ3SKIb3xOD8pXCWQoVRN14ldroRno
 4stOAtDrpD5wvK2JhlWv1EYlxGVLqLcakZt/BwgDX/cJGkAx49Q/s29FUnesB9Ep
 sCH5chffeQskOL9CrSwboNmucgt4HGQORc4UL/KtPOEBtyfu/LCXEKSqAKVyQZtZ
 OerouOHWqQE5lT2K6qD/KKFW4lV2t1h+xzqsvZk4ZR5o3s+PAGai6D/wf+JgY9Hk
 uor9ju/e0htcI9m0aFdHDnltV0OOwIhR2bxWTuBBUkyFVtdQQY+1MRTTtuunWIB4
 SDYv6LrNL/0HAIuTlPQH99rnsFNnRZCtTpdbT7GRckAMeWMvy19bF2ZB1FXuSn+h
 5dxIo0qkw8nv4Y9wQ6QmgOcSzYyidUrCgLTO516qXVAKY0kl/u4q/zPr0Fmx/qfY
 oxspelDv0qd2NMQwJ/AmwjAjkQBulv5DVLu+cDXdOMkc/EbhzWyvetcHiNukxjHn
 mukCBxTlLoDLug2LFkAPIddEutj+VUEefkf/pD/js8uYuyd9ZnPjiIh6fG25il9a
 cHbMYtANt2EnZjwI9Z74
 =T6t1
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2016-10-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
This is relatively small, mostly to get the SG/crypto
from stack removal fix that crashes things when VMAP
stack is used in conjunction with software crypto.

Aside from that, we have:
 * a fix for AP_VLAN usage with the nl80211 frame command
 * two fixes (and two preparation patches) for A-MSDU, one
   to discard group-addressed (multicast) and unexpected
   4-address A-MSDUs, the other to validate A-MSDU inner
   MAC addresses properly to prevent controlled port bypass
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:26:15 -04:00
Ivan Vecera
6bc80629ee bnx2: fix locking when netconsole is used
Functions bnx2_reg_rd_ind(), bnx2_reg_wr_ind() and bnx2_ctx_wr()
can be called with IRQs disabled when netconsole is enabled. So they
should use spin_{,un}lock_irq{save,restore} instead of _bh variants.

Example call flow:
bnx2_poll()
  ->bnx2_poll_link()
    ->bnx2_phy_int()
      ->bnx2_set_remote_link()
        ->bnx2_shmem_rd()
          ->bnx2_reg_rd_ind()
            -> spin_lock_bh(&bp->indirect_lock);
               spin_unlock_bh(&bp->indirect_lock);
               ...
               -> __local_bh_enable_ip

static inline void __local_bh_enable_ip(unsigned long ip)
      WARN_ON_ONCE(in_irq() || irqs_disabled());   <<<<<< WARN

Cc: Sony Chacko <sony.chacko@qlogic.com>
Cc: Dept-HSGLinuxNICDev@qlogic.com
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-18 10:01:51 -04:00
David S. Miller
3dfcb4f56f Merge branch 'net-driver-autoload'
Javier Martinez Canillas says:

====================
net: Fix module autoload for several platform drivers

I noticed that module autoload won't be working in a bunch of platform
drivers in the net subsystem and this patch series contains the fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:04 -04:00
Javier Martinez Canillas
0822b43e4f net: dsa: bcm_sf2: Fix module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/dsa/bcm_sf2.ko | grep alias
alias:          platform:brcm-sf2

After this patch:

$ modinfo drivers/net/dsa/bcm_sf2.ko | grep alias
alias:          platform:brcm-sf2
alias:          of:N*T*Cbrcm,bcm7445-switch-v4.0C*
alias:          of:N*T*Cbrcm,bcm7445-switch-v4.0

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:03 -04:00
Javier Martinez Canillas
03eaae5253 net: dsa: b53: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/dsa/b53/b53_mmap.ko  | grep alias
$

After this patch:

$ modinfo drivers/net/dsa/b53/b53_mmap.ko  | grep alias
alias:          of:N*T*Cbrcm,bcm63xx-switchC*
alias:          of:N*T*Cbrcm,bcm63xx-switch
alias:          of:N*T*Cbrcm,bcm6368-switchC*
alias:          of:N*T*Cbrcm,bcm6368-switch
alias:          of:N*T*Cbrcm,bcm6328-switchC*
alias:          of:N*T*Cbrcm,bcm6328-switch
alias:          of:N*T*Cbrcm,bcm3384-switchC*
alias:          of:N*T*Cbrcm,bcm3384-switch

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:03 -04:00
Javier Martinez Canillas
af40097e3e net: hisilicon: Fix hns_mdio module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/ethernet/hisilicon//hns_mdio.ko | grep alias
alias:          platform:Hi-HNS_MDIO
alias:          acpi*:HISI0141:*

After this patch:

$ modinfo drivers/net/ethernet/hisilicon//hns_mdio.ko | grep alias
alias:          platform:Hi-HNS_MDIO
alias:          of:N*T*Chisilicon,hns-mdioC*
alias:          of:N*T*Chisilicon,hns-mdio
alias:          of:N*T*Chisilicon,mdioC*
alias:          of:N*T*Chisilicon,mdio
alias:          acpi*:HISI0141:*

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:03 -04:00
Javier Martinez Canillas
7097268509 net: qcom/emac: Fix module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/ethernet/qualcomm/emac/qcom-emac.ko | grep alias
alias:          platform:qcom-emac

After this patch:

$ modinfo drivers/net/ethernet/qualcomm/emac/qcom-emac.ko | grep alias
alias:          platform:qcom-emac
alias:          of:N*T*Cqcom,fsm9900-emacC*
alias:          of:N*T*Cqcom,fsm9900-emac

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:03 -04:00
Javier Martinez Canillas
a7deb924d3 net: hns: Fix hns_dsaf module autoload for OF registration
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/ethernet/hisilicon/hns/hns_dsaf.ko | grep alias
alias:          acpi*:HISI00B2:*
alias:          acpi*:HISI00B1:*

After this patch:

$ modinfo drivers/net/ethernet/hisilicon/hns/hns_dsaf.ko | grep alias
alias:          acpi*:HISI00B2:*
alias:          acpi*:HISI00B1:*
alias:          of:N*T*Chisilicon,hns-dsaf-v2C*
alias:          of:N*T*Chisilicon,hns-dsaf-v2
alias:          of:N*T*Chisilicon,hns-dsaf-v1C*
alias:          of:N*T*Chisilicon,hns-dsaf-v1

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:02 -04:00
Javier Martinez Canillas
2fa3e317e6 net: ethernet: nb8800: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ $ modinfo drivers/net/ethernet/aurora/nb8800.ko | grep alias
$

After this patch:

$ modinfo drivers/net/ethernet/aurora/nb8800.ko | grep alias
alias:          of:N*T*Csigma,smp8734-ethernetC*
alias:          of:N*T*Csigma,smp8734-ethernet
alias:          of:N*T*Csigma,smp8642-ethernetC*
alias:          of:N*T*Csigma,smp8642-ethernet
alias:          of:N*T*Caurora,nb8800C*
alias:          of:N*T*Caurora,nb8800

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:02 -04:00
Javier Martinez Canillas
fc971a2f23 net: nps_enet: Fix module autoload
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/net/ethernet/ezchip/nps_enet.ko | grep alias
$

After this patch:

$ modinfo drivers/net/ethernet/ezchip/nps_enet.ko | grep alias
alias:          of:N*T*Cezchip,nps-mgt-enetC*
alias:          of:N*T*Cezchip,nps-mgt-enet

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 13:03:02 -04:00
Colin Ian King
67b11e2ea7 cxgb4: fix memory leak of qe on error exit path
A memory leak of qe occurs when t4_sched_queue_unbind fails,
so fix this by free'ing qe on the error exit path.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 11:24:08 -04:00
Eric Dumazet
9a0b1e8ba4 net: pktgen: remove rcu locking in pktgen_change_name()
After Jesper commit back in linux-3.18, we trigger a lockdep
splat in proc_create_data() while allocating memory from
pktgen_change_name().

This patch converts t->if_lock to a mutex, since it is now only
used from control path, and adds proper locking to pktgen_change_name()

1) pktgen_thread_lock to protect the outer loop (iterating threads)
2) t->if_lock to protect the inner loop (iterating devices)

Note that before Jesper patch, pktgen_change_name() was lacking proper
protection, but lockdep was not able to detect the problem.

Fixes: 8788370a1d ("pktgen: RCU-ify "if_list" to remove lock in next_to_run()")
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 10:52:59 -04:00
David Ahern
a04a480d43 net: Require exact match for TCP socket lookups if dif is l3mdev
Currently, socket lookups for l3mdev (vrf) use cases can match a socket
that is bound to a port but not a device (ie., a global socket). If the
sysctl tcp_l3mdev_accept is not set this leads to ack packets going out
based on the main table even though the packet came in from an L3 domain.
The end result is that the connection does not establish creating
confusion for users since the service is running and a socket shows in
ss output. Fix by requiring an exact dif to sk_bound_dev_if match if the
skb came through an interface enslaved to an l3mdev device and the
tcp_l3mdev_accept is not set.

skb's through an l3mdev interface are marked by setting a flag in
inet{6}_skb_parm. The IPv6 variant is already set; this patch adds the
flag for IPv4. Using an skb flag avoids a device lookup on the dif. The
flag is set in the VRF driver using the IP{6}CB macros. For IPv4, the
inet_skb_parm struct is moved in the cb per commit 971f10eca1, so the
match function in the TCP stack needs to use TCP_SKB_CB. For IPv6, the
move is done after the socket lookup, so IP6CB is used.

The flags field in inet_skb_parm struct needs to be increased to add
another flag. There is currently a 1-byte hole following the flags,
so it can be expanded to u16 without increasing the size of the struct.

Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-17 10:17:05 -04:00
Ard Biesheuvel
f4a067f9ff mac80211: move struct aead_req off the stack
Some crypto implementations (such as the generic CCM wrapper in crypto/)
use scatterlists to map fields of private data in their struct aead_req.
This means these data structures cannot live in the vmalloc area, which
means that they cannot live on the stack (with CONFIG_VMAP_STACK.)

This currently occurs only with the generic software implementation, but
the private data and usage is implementation specific, so move the whole
data structures off the stack into heap by allocating every time we need
to use them.

In addition, take care not to put any of our own stack allocations into
scatterlists. This involves reserving some extra room when allocating the
aead_request structures, and referring to those allocations in the scatter-
lists (while copying the data from the stack before the crypto operation)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-10-17 16:14:04 +02:00
Alexey Khoroshilov
fb5c6cfaec vmxnet3: avoid assumption about invalid dma_pa in vmxnet3_set_mc()
vmxnet3_set_mc() checks new_table_pa returned by dma_map_single()
with dma_mapping_error(), but even there it assumes zero is invalid pa
(it assumes dma_mapping_error(...,0) returns true if new_table is NULL).

The patch adds an explicit variable to track status of new_table_pa.

Found by Linux Driver Verification project (linuxtesting.org).

v2: use "bool" and "true"/"false" for boolean variables.
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15 17:47:32 -04:00
Dan Carpenter
50756ebecf stmmac: fix an error code in stmmac_ptp_register()
PTR_ERR(NULL) is success.  We have to preserve the error code earlier.

Fixes: 7086605a6a ("stmmac: fix error check when init ptp")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15 17:35:30 -04:00
Timur Tabi
93966b715b net: qcom/emac: disable interrupts before calling phy_disconnect
There is a race condition that can occur if EMAC interrupts are
enabled when phy_disconnect() is called.  phy_disconnect() sets
adjust_link to NULL.  When an interrupt occurs, the ISR might
call phy_mac_interrupt(), which wakes up the workqueue function
phy_state_machine().  This function might reference adjust_link,
thereby causing a null pointer exception.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15 17:34:43 -04:00
Ard Biesheuvel
f007643613 r8169: set coherent DMA mask as well as streaming DMA mask
PCI devices that are 64-bit DMA capable should set the coherent
DMA mask as well as the streaming DMA mask. On some architectures,
these are managed separately, and so the coherent DMA mask will be
left at its default value of 32 if it is not set explicitly. This
results in errors such as

     r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
     hwdev DMA mask = 0x00000000ffffffff, dev_addr = 0x00000080fbfff000
     swiotlb: coherent allocation failed for device 0000:02:00.0 size=4096
     CPU: 0 PID: 1062 Comm: systemd-udevd Not tainted 4.8.0+ #35
     Hardware name: AMD Seattle/Seattle, BIOS 10:53:24 Oct 13 2016

on systems without memory that is 32-bit addressable by PCI devices.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-15 17:29:38 -04:00
David S. Miller
9e55d0f954 wireless-drivers fixes for 4.9
wlcore
 
 * fix a double free regression causing hard to track crashes
 
 rtl8xxxu
 
 * fix driver reload issues, a memory leak and an endian bug
 
 rtlwifi
 
 * fix a major regression introduced in 4.9 with firmware loading on
   certain hardware
 
 ath10k
 
 * fix regression about broken cal_data debugfs file (since 4.7)
 
 ath9k
 
 * revert temperature compensation for AR9003+ devices, it was causing
   too much problems
 
 ath6kl
 
 * add Dell OEM SDIO I/O for the Venue 8 Pro
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJYAIKDAAoJEG4XJFUm622b7IcH/iV/mLuZ0Nyh7tsNeSElTjVc
 AEFUdcxVRjM96n9K8AcHZVFqCSMggibOgulGiwHO48DLWOhOdzhffn3GZaRbHTbL
 5o3d527R1kvx+UP8KIFF5MHjagMEC3/AYi417/DiCGH0BM6UcFPFxXOOWePiG0ov
 Lz14xi/8AUrOnREc3TzvDsRUaI2VO3vvXq+IQtMwQBmtczeaKwjjw/RsqkCdkD70
 yqiZVO7FJ3mDH1ybxmkbhlAklG7p9x01e3+gRbnKdEqYiaoYUrRGNE5wq00fuFBU
 Xa3vbQa22csD2ShGHtPnZ6kg4X/q2gLmxNWl6M5TVvwUVD313AQepIcnTiHnwro=
 =TI3A
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2016-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.9

wlcore

* fix a double free regression causing hard to track crashes

rtl8xxxu

* fix driver reload issues, a memory leak and an endian bug

rtlwifi

* fix a major regression introduced in 4.9 with firmware loading on
  certain hardware

ath10k

* fix regression about broken cal_data debugfs file (since 4.7)

ath9k

* revert temperature compensation for AR9003+ devices, it was causing
  too much problems

ath6kl

* add Dell OEM SDIO I/O for the Venue 8 Pro
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 16:08:13 -04:00
Guenter Roeck
610df1d234 net: asix: Avoid looping when the device does not respond
Check answers from USB stack and avoid re-sending the request
multiple times if the device does not respond.

This fixes the following problem, observed with a probably flaky adapter.

[62108.732707] usb 1-3: new high-speed USB device number 5 using xhci_hcd
[62108.914421] usb 1-3: New USB device found, idVendor=0b95, idProduct=7720
[62108.914463] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[62108.914476] usb 1-3: Product: AX88x72A
[62108.914486] usb 1-3: Manufacturer: ASIX Elec. Corp.
[62108.914495] usb 1-3: SerialNumber: 000001
[62114.109109] asix 1-3:1.0 (unnamed net_device) (uninitialized):
	Failed to write reg index 0x0000: -110
[62114.109139] asix 1-3:1.0 (unnamed net_device) (uninitialized):
	Failed to send software reset: ffffff92
[62119.109048] asix 1-3:1.0 (unnamed net_device) (uninitialized):
	Failed to write reg index 0x0000: -110
...

Since the USB timeout is 5 seconds, and the operation is retried 30 times,
this results in

[62278.180353] INFO: task mtpd:1725 blocked for more than 120 seconds.
[62278.180373]       Tainted: G        W      3.18.0-13298-g94ace9e #1
[62278.180383] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
...
[62278.180957] kworker/2:0     D 0000000000000000     0  5744      2 0x00000000
[62278.180978] Workqueue: usb_hub_wq hub_event
[62278.181029]  ffff880177f833b8 0000000000000046 ffff88017fd00000 ffff88017b126d80
[62278.181048]  ffff880177f83fd8 ffff880065a71b60 0000000000013340 ffff880065a71b60
[62278.181065]  0000000000000286 0000000103b1c199 0000000000001388 0000000000000002
[62278.181081] Call Trace:
[62278.181092]  [<ffffffff8e0971fd>] ? console_conditional_schedule+0x2c/0x2c
[62278.181105]  [<ffffffff8e094f7b>] schedule+0x69/0x6b
[62278.181117]  [<ffffffff8e0972e0>] schedule_timeout+0xe3/0x11d
[62278.181133]  [<ffffffff8daadb1b>] ? trace_timer_start+0x51/0x51
[62278.181146]  [<ffffffff8e095a05>] do_wait_for_common+0x12f/0x16c
[62278.181162]  [<ffffffff8da856a7>] ? wake_up_process+0x39/0x39
[62278.181174]  [<ffffffff8e095aee>] wait_for_common+0x52/0x6d
[62278.181187]  [<ffffffff8e095b3b>] wait_for_completion_timeout+0x13/0x15
[62278.181201]  [<ffffffff8de676ce>] usb_start_wait_urb+0x93/0xf1
[62278.181214]  [<ffffffff8de6780d>] usb_control_msg+0xe1/0x11d
[62278.181230]  [<ffffffffc037d629>] usbnet_write_cmd+0x9c/0xc6 [usbnet]
[62278.181286]  [<ffffffffc03af793>] asix_write_cmd+0x4e/0x7e [asix]
[62278.181300]  [<ffffffffc03afb41>] asix_set_sw_mii+0x25/0x4e [asix]
[62278.181314]  [<ffffffffc03b001d>] asix_mdio_read+0x51/0x109 [asix]
...

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 16:06:54 -04:00
Jesse Brandeburg
85a624403c ethtool: silence warning on bit loss
Sparse was complaining when we went to prototype some code
using ethtool_cmd_speed_set and SPEED_100000, which uses
the upper 16 bits of __u32 speed for the first time.

CHECK
...
.../uapi/linux/ethtool.h:123:28: warning:
  cast truncates bits from constant value (186a0 becomes 86a0)

The warning is actually bogus, as no bits are really lost, but
we can get rid of the sparse warning with this one small change.

Reported-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 16:05:42 -04:00
Brenden Blanco
958b3d396d net/mlx4_en: fixup xdp tx irq to match rx
In cases where the number of tx rings is not a multiple of the number of
rx rings, the tx completion event will be handled on a different core
from the transmit and population of the ring. Races on the ring will
lead to a double-free of the page, and possibly other corruption.

The rings are initialized by default with a valid multiple of rings,
based on the number of cpus, therefore an invalid configuration requires
ethtool to change the ring layout. For instance 'ethtool -L eth0 rx 9 tx
8' will cause packets received on rx0, and XDP_TX'd to tx48, to be
completed on cpu3 (48 % 9 == 3).

Resolve this discrepancy by shifting the irq for the xdp tx queues to
start again from 0, modulo rx_ring_num.

Fixes: 9ecc2d8617 ("net/mlx4_en: add xdp forwarding and data write support")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 11:13:00 -04:00
David S. Miller
fbbfa34c2c Merge branch 'qed-fixes'
Yuval Mintz says:

====================
qed: Fix dependencies and warnings series

The first patch in this series follows Dan Carpenter's reports about
Smatch warnings for recent qed additions and fixes those.

The second patch is the most significant one [and the reason this is
ntended for 'net'] - it's based on Arnd Bermann's suggestion for fixing
compilation issues that were introduced with the roce addition as a result
of certain combinations of qed, qede and qedr Kconfig options.

The third follows the discussion with Arnd and clears a lot of the warnings
that arise when compiling the drivers with "C=1".

Please consider applying this series to 'net'.
====================

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 11:07:23 -04:00
Yuval Mintz
8c93beaf57 qed: Additional work toward cleaning C=1
This cleans many of the warnings that would arise in qed as a
result of compilations with C=1; Most of those are the addition
of missing 'static' to functions, although there are several other
fixes as well.

Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 11:07:22 -04:00
Yuval Mintz
0189efb8f4 qed*: Fix Kconfig dependencies with INFINIBAND_QEDR
The qedr driver would require a tristate Kconfig option [to allow
it to compile as a module], and toward that end we've added the
INFINIBAND_QEDR option. But as we've made the compilation of the
qed/qede infrastructure required for RoCE dependent on the option
we'd be facing linking difficulties in case that QED=y or QEDE=y,
and INFINIBAND_QEDR=m.

To resolve this, we seperate between the INFINIBAND_QEDR option
and the infrastructure support in qed/qede by introducing a new
QED_RDMA option which would be selected by INFINIBAND_QEDR but would
be a boolean instead of a tristate; Following that, the qed/qede is
fixed based on this new option so that all config combinations would
be supported.

Fixes: cee9fbd8e2 ("qede: add qedr framework")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 11:07:22 -04:00
Yuval Mintz
ce6b04ee8b qed: Fix static checker warning.
Smatch compains about qed_roce_ll2_tx() dereference
of the 'cdev' variable while testing its validity later.
As the validation checking is an over-kill [variable would always
be set], simply remove it.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: abd49676c7 ("qed: Add RoCE ll2 & GSI support")
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 11:07:22 -04:00
Jiri Bohac
76506a986d IPv6: fix DESYNC_FACTOR
The IPv6 temporary address generation uses a variable called DESYNC_FACTOR
to prevent hosts updating the addresses at the same time. Quoting RFC 4941:

   ... The value DESYNC_FACTOR is a random value (different for each
   client) that ensures that clients don't synchronize with each other and
   generate new addresses at exactly the same time ...

DESYNC_FACTOR is defined as:

   DESYNC_FACTOR -- A random value within the range 0 - MAX_DESYNC_FACTOR.
   It is computed once at system start (rather than each time it is used)
   and must never be greater than (TEMP_VALID_LIFETIME - REGEN_ADVANCE).

First, I believe the RFC has a typo in it and meant to say: "and must
never be greater than (TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE)"

The reason is that at various places in the RFC, DESYNC_FACTOR is used in
a calculation like (TEMP_PREFERRED_LIFETIME - DESYNC_FACTOR) or
(TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE - DESYNC_FACTOR). It needs to be
smaller than (TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE) for the result of
these calculations to be larger than zero. It's never used in a
calculation together with TEMP_VALID_LIFETIME.

I already submitted an errata to the rfc-editor:
https://www.rfc-editor.org/errata_search.php?rfc=4941

The Linux implementation of DESYNC_FACTOR is very wrong:
max_desync_factor is used in places DESYNC_FACTOR should be used.
max_desync_factor is initialized to the RFC-recommended value for
MAX_DESYNC_FACTOR (600) but the whole point is to get a _random_ value.

And nothing ensures that the value used is not greater than
(TEMP_PREFERRED_LIFETIME - REGEN_ADVANCE), which leads to underflows.  The
effect can easily be observed when setting the temp_prefered_lft sysctl
e.g. to 60. The preferred lifetime of the temporary addresses will be
bogus.

TEMP_PREFERRED_LIFETIME and REGEN_ADVANCE are not constants and can be
influenced by these three sysctls: regen_max_retry, dad_transmits and
temp_prefered_lft. Thus, the upper bound for desync_factor needs to be
re-calculated each time a new address is generated and if desync_factor is
larger than the new upper bound, a new random value needs to be
re-generated.

And since we already have max_desync_factor configurable per interface, we
also need to calculate and store desync_factor per interface.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 10:59:15 -04:00
Jiri Bohac
9d6280da39 IPv6: Drop the temporary address regen_timer
The randomized interface identifier (rndid) was periodically updated from
the regen_timer timer. Simplify the code by updating the rndid only when
needed by ipv6_try_regen_rndid().

This makes the follow-up DESYNC_FACTOR fix much simpler.  Also it fixes a
reference counting error in this error path, where an in6_dev_put was
missing:
		err = addrconf_sysctl_register(ndev);
		if (err) {
			ipv6_mc_destroy_dev(ndev);
	-               del_timer(&ndev->regen_timer);
			snmp6_unregister_dev(ndev);
			goto err_release;

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 10:59:15 -04:00
Paolo Abeni
fc791b6335 IB/ipoib: move back IB LL address into the hard header
After the commit 9207f9d45b ("net: preserve IP control block
during GSO segmentation"), the GSO CB and the IPoIB CB conflict.
That destroy the IPoIB address information cached there,
causing a severe performance regression, as better described here:

http://marc.info/?l=linux-kernel&m=146787279825501&w=2

This change moves the data cached by the IPoIB driver from the
skb control lock into the IPoIB hard header, as done before
the commit 936d7de3d7 ("IPoIB: Stop lying about hard_header_len
and use skb->cb to stash LL addresses").
In order to avoid GRO issue, on packet reception, the IPoIB driver
stash into the skb a dummy pseudo header, so that the received
packets have actually a hard header matching the declared length.
To avoid changing the connected mode maximum mtu, the allocated
head buffer size is increased by the pseudo header length.

After this commit, IPoIB performances are back to pre-regression
value.

v2 -> v3: rebased
v1 -> v2: avoid changing the max mtu, increasing the head buf size

Fixes: 9207f9d45b ("net: preserve IP control block during GSO segmentation")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 10:54:53 -04:00
David S. Miller
f1f081cef0 RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAV/+wwfSw1s6N8H32AQLpVA/+NByreKyI8cHL1zgz816iTrzYbEP5Gbtw
 RI9TI5iweUa9ySe4PFQUw+VC0yaAP9brY8tTtss8KHk808Wu4xhlg8fAClOaZXwy
 WmHASdwnRaDWguEpPHyHRST+s9ZO/VD5vwGhREB/hojzdzd135bq1d6GKaHoLFx2
 XDwDeyZc1z+aSzdMCoQuKJlqw9mfujsIOK5xZJ/h/JquYJ3iER55vdofettNCT+S
 hjueVBgWV988oORBtduPrfUYBbQI83QyiWl0xdo6QXWAoN784NfpVti8YAA9B7To
 qfup5aE6ky3LhuRD8GS00yWb96b43FGuPqt27LTH7SGnALX7KbETbBcgasyWqFeV
 UvPbVlk5R+0OXLxqOHvn20gRFS2c6HIdVAW6h7QHB0qwnLS1JJJToAczhK4QTsHQ
 eOXSGK4Gj0CiTd23bL7egaULKnD7eiZbagoty1UL05k8TAgPRnXXaMCRc0cupvKo
 5Amk7xT7ZN1Iyh9TQSt2MRIBIG6AOJogsmjSqgKJZY4YdCD5rYAWyAzc7K2l/L0e
 oUwDaPFWwNAfVYxJIXSd223squyxaXen0B+NlY501tqw4Ce4NnuEssqsSWk6OTHZ
 R9HCT200HvvZthPOStP7rdFiQ6VRDH3aSdl3zU4ila7cxr4wfdVslShuj91OiLqI
 /7zodCmls/8=
 =7krl
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20161013' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Fixes

This set of patches contains a bunch of fixes:

 (1) Fix use of kunmap() after change from kunmap_atomic() within AFS.

 (2) Don't use of ERR_PTR() with an always zero value.

 (3) Check the right error when using ip6_route_output().

 (4) Be consistent about whether call->operation_ID is BE or CPU-E within
     AFS.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-14 10:44:45 -04:00