linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-29 05:44:11 +00:00

Author	SHA1	Message	Date
Joe Stringer	8282f27449	inet: frag: Always orphan skbs inside ip_defrag() Later parts of the stack (including fragmentation) expect that there is never a socket attached to frag in a frag_list, however this invariant was not enforced on all defrag paths. This could lead to the BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the end of this commit message. While the call could be added to openvswitch to fix this particular error, the head and tail of the frags list are already orphaned indirectly inside ip_defrag(), so it seems like the remaining fragments should all be orphaned in all circumstances. kernel BUG at net/ipv4/ip_output.c:586! [...] Call Trace: <IRQ> [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch] [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch] [<ffffffff81667830>] ? dst_discard_out+0x20/0x20 [<ffffffff81667810>] ? dst_ifdown+0x80/0x80 [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch] [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch] [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch] [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch] [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch] [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch] [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch] [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch] [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch] [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch] [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch] [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0 [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0 [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930 [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90 [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220 [<ffffffff81661060>] dev_queue_xmit+0x10/0x20 [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220 [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590 [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590 [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190 [<ffffffff816ab4c0>] ip_output+0x70/0x110 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70 [<ffffffff816abf89>] ip_send_skb+0x19/0x40 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40 [<ffffffff816df21a>] icmp_push_reply+0xea/0x120 [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230 [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70 [<ffffffff816dfa06>] icmp_echo+0x36/0x70 [<ffffffff816e0d11>] icmp_rcv+0x271/0x450 [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0 [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0 [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0 [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560 [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560 [<ffffffff816a5453>] ip_rcv+0x283/0x3e0 [<ffffffff810b6302>] ? match_held_lock+0x192/0x200 [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40 [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0 [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90 [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60 [<ffffffff8165e678>] process_backlog+0x78/0x230 [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230 [<ffffffff8165e355>] net_rx_action+0x155/0x400 [<ffffffff8106b48c>] __do_softirq+0xcc/0x420 [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590 [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff8106b88e>] do_softirq+0x4e/0x60 [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0 [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590 [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190 [<ffffffff816ab4c0>] ip_output+0x70/0x110 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70 [<ffffffff816abf89>] ip_send_skb+0x19/0x40 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40 [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90 [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70 [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0 [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0 [<ffffffff8163e398>] sock_sendmsg+0x38/0x50 [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270 [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320 [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40 [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460 [<ffffffff81204886>] ? __fget_light+0x66/0x90 [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80 [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20 [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 RIP [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0 RSP <ffff88006d603170> Fixes: `7f8a436eaa` ("openvswitch: Add conntrack action") Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 16:00:46 -08:00
David S. Miller	2cc5e4caf4	Merge branch 'sctp-transport-races' Xin Long says: ==================== fix the transport dead race check by using atomic_add_unless on refcnt sctp: fix the transport dead race check by using atomic_add_unless on refcnt sctp: hold transport before we access t->asoc in sctp proc sctp: remove the dead field of sctp_transport ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:59:33 -08:00
Xin Long	47faa1e4c5	sctp: remove the dead field of sctp_transport After we use refcnt to check if transport is alive, the dead can be removed from sctp_transport. The traversal of transport_addr_list in procfs dump is using list_for_each_entry_rcu, no need to check if it has been freed. sctp_generate_t3_rtx_event and sctp_generate_heartbeat_event is protected by sock lock, it's not necessary to check dead, either. also, the timers are cancelled when sctp_transport_free() is called, that it doesn't wait for refcnt to reach 0 to cancel them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:59:32 -08:00
Xin Long	fba4c330c5	sctp: hold transport before we access t->asoc in sctp proc Previously, before rhashtable, /proc assoc listing was done by read-locking the entire hash entry and dumping all assocs at once, so we were sure that the assoc wasn't freed because it wouldn't be possible to remove it from the hash meanwhile. Now we use rhashtable to list transports, and dump entries one by one. That is, now we have to check if the assoc is still a good one, as the transport we got may be being freed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:59:32 -08:00
Xin Long	1eed677933	sctp: fix the transport dead race check by using atomic_add_unless on refcnt Now when __sctp_lookup_association is running in BH, it will try to check if t->dead is set, but meanwhile other CPUs may be freeing this transport and this assoc and if it happens that __sctp_lookup_association checked t->dead a bit too early, it may think that the association is still good while it was already freed. So we fix this race by using atomic_add_unless in sctp_transport_hold. After we get one transport from hashtable, we will hold it only when this transport's refcnt is not 0, so that we can make sure t->asoc cannot be freed before we hold the asoc again. Note that sctp association is not freed using RCU so we can't use atomic_add_unless() with it as it may just be too late for that either. Fixes: `4f00878126` ("sctp: apply rhashtable api to send/recv path") Reported-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:59:32 -08:00
David S. Miller	2baaa2d138	Merge branch 'mlxsw-fixes' Jiri Pirko says: ==================== mlxsw: driver fixes Couple of various mlxsw driver fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:32 -08:00
Ido Schimmel	bbeeda27ab	mlxsw: reg: Use correct offset in field definiton The rx_lane, tx_lane and module fields in the PMLP register don't have an additional offset besides the base one (0x04), so set it to 0x00. Fixes: `4ec14b7634` ("mlxsw: Add interface to access registers and process events") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:32 -08:00
Ido Schimmel	3f47f86781	mlxsw: spectrum: Compare local ports instead of pointers When dumping the FDB we can't compare the actual pointers of the ports structs, as it's possible the struct represents a vPort instead of the underlying physical port. Solve this by comparing the local port number instead, as it's shared between the physical ports and all the vPorts on top of him. Fixes: `54a732018d` ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:32 -08:00
Ido Schimmel	304f51584f	mlxsw: spectrum: Dump LAG FDB records only once LAG FDB records can only point to LAG devices or VLAN devices configured on top of them. Therefore, when dumping the FDB we shouldn't associate these records with the underlying physical ports. Fixes: `8a1ab5d766` ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:31 -08:00
Ido Schimmel	e43aca22ed	mlxsw: spectrum: Use correct netdev when notifying bridge LAG FDB entries pointing to VLAN devices should be reported to the bridge with the matching VLAN device and not the underlying LAG device. Fixes: `aac78a4408` ("mlxsw: spectrum: Adjust FDB notifications for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:31 -08:00
Ido Schimmel	004f85ea82	mlxsw: spectrum: Don't report VLAN for 802.1D FDB entries When dumping the hardware FDB we should report entries pointing to VLAN devices with VLAN 0, as packets coming into the bridge are untagged. Likewise, pass FDB_{ADD,DEL} notifications with VLAN 0 for these devices. Fixes: `54a732018d` ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:31 -08:00
Ido Schimmel	45827d780a	mlxsw: spectrum: Notify bridge's FDB only based on learning_sync When we disable learning on bridge port we should still update the software bridge's FDB when entry pointing to this bridge port is aged-out. We can otherwise have an inconsistency between software and hardware tables. Fixes: `8a1ab5d766` ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:31 -08:00
Ido Schimmel	454911333b	mlxsw: spectrum: Disable learning according to STP state When port is put into LISTENING state it shouldn't populate the FDB, so set the port's STP state in hardware to DISCARDING instead of LEARNING. It will therefore keep listening to BPDU packets, but discard other non-control packets and won't perform any learning. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:31 -08:00
Ido Schimmel	9cb026ebb8	mlxsw: spectrum: Don't forward packets when STP state is DISABLED When STP state is set to DISABLED the port is assumed to be inactive, but currently we forward packets ingressing through it. Instead, set the port's STP state in hardware to DISCARDING, which means it doesn't forward packets or perform any learning, but it does trap control packets. However, these packets will be dropped by bridge code, which results in the expected behavior. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:30 -08:00
Ido Schimmel	039c49a6e6	mlxsw: spectrum: Flush FDB when leaving bridge As explained in previous commit, we should always take care of flushing the FDB in the driver and not rely on bridge code. We need to distinguish between two cases with regards to LAG: 1) Port is leaving LAG while LAG is bridged (or VLAN devices on top of it). In this case don't flush the FDB entries pointing to the LAG ID, as this will affect other ports still member in the LAG. Only flush the FDB when the last port in the LAG is leaving the bridge. 2) LAG device is leaving the bridge. In this case the CHANGEUPPER event is simply propagated to each member port, so make each port flush the FDB in its turn. Note that emptying a bridged LAG from ports creates an inconsistency between hardware and software. A user who later (< ageing_time) re-populates the LAG won't have any FDB entries pointing to the LAG ID in hardware, but they will be present in the software bridge's FDB. Currently there is no good solution to this problem, but this will be addressed by us in the future. In order to optimize the flushing process, flush by port or LAG ID if there are no VLAN interfaces on top of the port. Otherwise, flush using (Port / LAG ID, FID=VID} for each of the lower 4K FIDs. In the case of VLAN device simply flush using {Port / LAG ID, vFID} with the vFID to which the VLAN device is mapped to. Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:30 -08:00
Ido Schimmel	4193327135	mlxsw: reg: Add the Switch Filtering DB Flush register When removing a net device from a bridge we should flush the FDB entries associated with this net device. Up until now, we relied upon bridge code to do that for us, but it is possible for user to prevent hardware from syncing with the software bridge (learning_sync=0), so we need to flush overselves. Add the Switch Filtering DB Flush (SFDF) register that is used to flush FDB entries according to different parameters (per-port, per-FID etc). Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:30 -08:00
Ido Schimmel	4dc236c317	mlxsw: spectrum: Handle port leaving LAG while bridged It is possible for a user to remove a port from a LAG device, while the LAG device or VLAN devices on top of it are bridged. In these cases, bridge's teardown sequence is never issued, so we need to take care of it ourselves. When LAG's unlinking event is received by port netdev: 1) Traverse its vPorts list and make those member in a bridge leave it. They will be deleted later by LAG code. 2) Make the port netdev itself leave its bridge if member in one. Fixes: `0d65fc1304` ("mlxsw: spectrum: Implement LAG port join/leave") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-28 15:55:30 -08:00
David S. Miller	3b9e948809	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2016-01-25 This series contains updates to i40e only and so I won't continue receiving patches to fix the same issue (again). Arnd fixes the driver from causing the compiler whining about uninitialized variables, so initialize those variables. Eric fixes the build errors/warnings which were introduced by Anjali when she added geneve support to i40e. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 22:51:42 -08:00
Arnd Bergmann	79febbc19b	net: i40e: shut up uninitialized variable warnings intel/i40e/i40e_txrx.c: In function 'i40e_xmit_frame_ring': intel/i40e/i40e_txrx.c:2367:20: error: 'oiph' may be used uninitialized in this function [-Werror=maybe-uninitialized] intel/i40e/i40e_txrx.c:2317:16: note: 'oiph' was declared here intel/i40e/i40e_txrx.c:2367:17: error: 'oudph' may be used uninitialized in this function [-Werror=maybe-uninitialized] intel/i40e/i40e_txrx.c:2316:17: note: 'oudph' was declared here Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-01-25 15:49:36 -08:00
Eric Dumazet	5cae7615b6	i40e: fix build warnings Fixes following build warnings : drivers/net/ethernet/intel/i40e/i40e_main.c:7057:13: warning: 'i40e_sync_udp_filters_subtask' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8524:13: warning: 'i40e_add_vxlan_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8569:13: warning: 'i40e_del_vxlan_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8604:13: warning: 'i40e_add_geneve_port' defined but not used [-Wunused-function] drivers/net/ethernet/intel/i40e/i40e_main.c:8651:13: warning: 'i40e_del_geneve_port' defined but not used [-Wunused-function] Fixes: `6a89902405` ("i40e: geneve tunnel offload support") Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-01-25 15:39:24 -08:00
Haiyang Zhang	c85e492445	hv_netvsc: Fix book keeping of skb during batching process Since eliminating send_completion_tid from struct hv_netvsc_packet, we haven't add proper book keeping for the skb of the batched packet. This patch fixes this issue and allows the previous skb is properly freed. Otherwise, a panic may happen. Thanks to Simon Xiao <sixiao@microsoft.com> for bisecting and analysis. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:53 -08:00
Vitaly Kuznetsov	757647e10e	hv_netvsc: use skb_get_hash() instead of a homegrown implementation Recent changes to 'struct flow_keys' (e.g commit `d34af823ff` ("net: Add VLAN ID to flow_keys")) introduced a performance regression in netvsc driver. Is problem is, however, not the above mentioned commit but the fact that netvsc_set_hash() function did some assumptions on the struct flow_keys data layout and this is wrong. Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change will also imply switching to Jenkins hash from the currently used Toeplitz but it seems there is no good excuse for Toeplitz to stay. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:53 -08:00
Thadeu Lima de Souza Cascardo	87e57399e9	sit: set rtnl_link_ops before calling register_netdevice When creating a SIT tunnel with ip tunnel, rtnl_link_ops is not set before ipip6_tunnel_create is called. When register_netdevice is called, there is no linkinfo attribute in the NEWLINK message because of that. Setting rtnl_link_ops before calling register_netdevice fixes that. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:53 -08:00
Johannes Berg	05f3b50ea8	net: fec: use CONFIG_ARM instead of CONFIG_ARCH_MXC/SOC_IMX28 As Arnd Bergmann points out, using CONFIG_ARCH_MXC and/or SOC_IMX28 is wrong if some other ARM platform uses this device - the operation of the driver would depend on an unrelated ARM platform that might or might not be set for multi-platform kernels. Prior to my previous patch, any other platforms using it would have been broken already due to having the cbd_datlen/cbd_sc fields in the wrong order, but byte ordering correctly, so no such platforms can exist and work today. In any case, it seems likely that only Freescale SoCs use this part, and those are little-endian on ARM, so CONFIG_ARM is safe for them. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:53 -08:00
Sudip Mukherjee	62f2aaabcf	defxx: fix build warning We are getting many build warnings about: 'bar_start' may be used uninitialized and 'bar_len' may be used uninitialized They are not actually uninitialized as dfx_get_bars() will initialize them properly. But still lets have them initialized just to satisfy the compiler (gcc 4.8.2). Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:52 -08:00
Sudip Mukherjee	36df745536	net: macb: fix build warning We are getting build warning about: macb.c:2889:13: warning: 'tx_clk' may be used uninitialized in this function macb.c:2888:11: warning: 'hclk' may be used uninitialized in this function In reality they are not used uninitialized as clk_init() will initialize them, this patch will just silence the warning. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:52 -08:00
Johannes Berg	5cfa30397b	net: fec: make driver endian-safe The driver treats the device descriptors as CPU-endian, which appears to be correct with the default endianness on both ARM (typically LE) and PowerPC (typically BE) SoCs, indicating that the hardware block is generated differently. Add endianness annotations and byteswaps as necessary. It's not clear that the ifdef there really is correct and shouldn't just be #ifdef CONFIG_ARM, but I also can't test on anything but the i.MX6 HummingBoard where this gets it working with a BE kernel. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:51:52 -08:00
Russell King	db0e51afa4	net: dsa: fix mv88e6xxx switches Since commit `76e398a627` ("net: dsa: use switchdev obj for VLAN add/del ops"), the Marvell 88E6xxx switch has been unable to pass traffic between ports - any received traffic is discarded by the switch. Taking a port out of bridge mode and configuring a vlan on it also the port to start passing traffic. With the debugfs files re-instated to allow debug of this issue by comparing the register settings between the working and non-working case, the reason becomes clear: GLOBAL GLOBAL2 SERDES 0 1 2 3 4 5 6 - 7: 1111 707f 2001 2 2 2 2 2 0 2 + 7: 1111 707f 2001 1 1 1 1 1 0 1 Register 7 for the ports is the default vlan tag register, and in the non-working setup, it has been set to 2, despite vlan 2 not being configured. This causes the switch to drop all packets coming in to these ports. The working setup has the default vlan tag register set to 1, which is the default vlan when none is configured. Inspection of the code reveals why. The code prior to this commit was: - for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) { ... - if (!err && vlan->flags & BRIDGE_VLAN_INFO_PVID) - err = ds->drv->port_pvid_set(ds, p->port, vid); but the new code is: + for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) { ... + } ... + if (pvid) + err = _mv88e6xxx_port_pvid_set(ds, port, vid); This causes the new code to always set the default vlan to one higher than the old code. Fix this. Fixes: `76e398a627` ("net: dsa: use switchdev obj for VLAN add/del ops") Cc: <stable@vger.kernel.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:49:15 -08:00
Martin Roth	426f04684b	82xx: FCC: Fixing a bug causing to FCC port lock-up (second try) This is an additional patch to the one already submitted recently. The previous patch was not complete, and the FCC port lock-up scenario has been reproduced in lab. I had an opportunity to check the current patch in lab and the FCC port lock no longer freezes, while the previous patch still locks-up the FCC port. The current patch fixes a pointer arithmetic bug (second bug in the same line), which leads FCC port lock-up during underrun/collision handling. Within the tx_startup() function in mac-fcc.c, the address of last BD is not calculated correctly. As a result of wrong calculation of the last BD address, the next transmitted BD may be set to an area out of the transmit BD ring. This actually causes to port lock-up and it is not recoverable. Signed-off-by: Martin Roth <martin.roth@motorolasolutions.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:47:01 -08:00
Thomas Egerer	32b6170ca5	ipv4+ipv6: Make INET_ESP select CRYPTO_ECHAINIV The ESP algorithms using CBC mode require echainiv. Hence INET_ESP have to select CRYPTO_ECHAINIV in order to work properly. This solves the issues caused by a misconfiguration as described in [1]. The original approach, patching crypto/Kconfig was turned down by Herbert Xu [2]. [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2 Signed-off-by: Thomas Egerer <hakke_007@gmx.de> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-25 10:45:41 -08:00
Marcelo Ricardo Leitner	27f7ed2b11	sctp: allow setting SCTP_SACK_IMMEDIATELY by the application This patch extends commit `b93d647174` ("sctp: implement the sender side for SACK-IMMEDIATELY extension") as it didn't white list SCTP_SACK_IMMEDIATELY on sctp_msghdr_parse(), causing it to be understood as an invalid flag and returning -EINVAL to the application. Note that the actual handling of the flag is already there in sctp_datamsg_from_user(). https://tools.ietf.org/html/rfc7053#section-7 Fixes: `b93d647174` ("sctp: implement the sender side for SACK-IMMEDIATELY extension") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-24 22:43:21 -08:00
Arnd Bergmann	facc432faa	net: simplify napi_synchronize() to avoid warnings The napi_synchronize() function is defined twice: The definition for SMP builds waits for other CPUs to be done, while the uniprocessor variant just contains a barrier and ignores its argument. In the mvneta driver, this leads to a warning about an unused variable when we lookup the NAPI struct of another CPU and then don't use it: ethernet/marvell/mvneta.c: In function 'mvneta_percpu_notifier': ethernet/marvell/mvneta.c:2910:30: error: unused variable 'other_port' [-Werror=unused-variable] There are no other CPUs on a UP build, so that code never runs, but gcc does not know this. The nicest solution seems to be to turn the napi_synchronize() helper into an inline function for the UP case as well, as that leads gcc to not complain about the argument being unused. Once we do that, we can also combine the two cases into a single function definition and use if(IS_ENABLED()) rather than #ifdef to make it look a bit nicer. The warning first came up in linux-4.4, but I failed to catch it earlier. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `f864288544` ("net: mvneta: Statically assign queues to CPUs") Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-24 22:19:55 -08:00
Hannes Frederic Sowa	9a368aff9c	pptp: fix illegal memory access caused by multiple bind()s Several times already this has been reported as kasan reports caused by syzkaller and trinity and people always looked at RCU races, but it is much more simple. :) In case we bind a pptp socket multiple times, we simply add it to the callid_sock list but don't remove the old binding. Thus the old socket stays in the bucket with unused call_id indexes and doesn't get cleaned up. This causes various forms of kasan reports which were hard to pinpoint. Simply don't allow multiple binds and correct error handling in pptp_bind. Also keep sk_state bits in place in pptp_connect. Fixes: `00959ade36` ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Cc: Dmitry Kozlov <xeb@mail.ru> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Dmitry Vyukov <dvyukov@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dave Jones <davej@codemonkey.org.uk> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-24 22:18:26 -08:00
Iyappan Subramanian	b5d7a06906	drivers: net: xgene: fix extra IRQ issue For interrupt controller that doesn't support irq_disable and hardware with level interrupt, an extra interrupt may be pending. This patch fixes the issue by setting IRQ_DISABLE_UNLAZY flag for the interrupt line, as suggested by, 'commit `e9849777d0` ("genirq: Add flag to force mask in disable_irq[_nosync]()")' Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Tested-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-24 22:15:56 -08:00
Eric Dumazet	fa0dc04df2	af_unix: fix struct pid memory leak Dmitry reported a struct pid leak detected by a syzkaller program. Bug happens in unix_stream_recvmsg() when we break the loop when a signal is pending, without properly releasing scm. Fixes: `b3ca9b02b0` ("net: fix multithreaded signal handling in unix recv routines") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-24 22:04:49 -08:00
Johannes Weiner	4877be9019	net: sock: remove dead cgroup methods from struct proto The cgroup methods are no longer used after `baac50bbc3` ("net: tcp_memcontrol: simplify linkage between socket and page counter"). The hunk to delete them was included in the original patch but must have gotten lost during conflict resolution on the way upstream. Fixes: `baac50bbc3` ("net: tcp_memcontrol: simplify linkage between socket and page counter") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 14:16:51 -08:00
Teresa Remmet	0a9c453eef	net: phy: smsc: Fix disabling energy detect mode When the lan87xx_read_status function is getting called the energy detect mode is enabled again even if it has been disabled by device tree. Added private struct to check the energy detect status. Signed-off-by: Teresa Remmet <t.remmet@phytec.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:06:03 -08:00
David S. Miller	8e0c2ab262	Merge branch 'mvneta-multi-clk' Jisheng Zhang says: ==================== net: mvneta: support more than one clk Some platforms may provide more than one clk for the mvneta IP, for example Marvell BG4CT provides "core" clk for the mac core, and "axi" clk for the AXI bus logic. This series tries to addess the "more than one clk" issue. Note: to support BG4CT, we have lots of refactor work to do, eg. BG4CT doesn't have mbus concept etc. Since v2: - Name the optional clock as "bus", which is a bit more flexible. Since v1: - Add Thomas Acks to patch1 and patch2. - make sure the headers are really sorted (some headers are still unsorted in v1). - disable axi clk before disabling core clk, Thank Thomas. - update dt binding as Thomas suggested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:04:59 -08:00
Jisheng Zhang	e308cb835c	net: mvneta: update clocks property and document additional clock-names Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:04:59 -08:00
Jisheng Zhang	15cc4a4a99	net: mvneta: get optional bus clk Some platforms may provide more than one clk for the mvneta IP, for example Marvell BG4CT provides one clk for the mac core, and one clk for the AXI bus logic. Obviously this bus clk also need to be enabled. This patch adds this optional "bus" clk support. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:04:59 -08:00
Jisheng Zhang	2804ba4ede	net: mvneta: Try to get named core clock first Some platforms may provide more than one clk for the mvneta IP, for example Marvell BG4CT provides one clk for the mac core, and one clk for the AXI bus logic. To support for more than one clock, we'll need to distinguish between the clock by name. Change clock probing to first try to get "core" clock before falling back to unnamed clock. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:04:59 -08:00
Jisheng Zhang	0e03f563a0	net: mvneta: sort the headers in alphabetic order Sorting the headers in alphabetic order will help to reduce the conflict when adding new headers in the future. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:04:59 -08:00
Jisheng Zhang	2c832293e0	net: mvneta: fix trivial cut-off issue in mvneta_ethtool_update_stats When s->type is T_REG_64, the high 32bits are lost in val. This patch fixes this trivial issue. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Fixes: `9b0cdefa4c` ("net: mvneta: add ethtool statistics") Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:03:30 -08:00
yankejian	48189d6aaf	net: hns: enet specifies a reference to dsaf This patch replace the assoication between dsaf and enet from string matching to object reference. It requires the DTS to be updated within BIOS. Thanks god it can be done for all released boards. Signed-off-by: Kejian Yan <yankejian@huawei.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Yisen Zhuang <yisen.zhuang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 12:02:31 -08:00
Eric Dumazet	e62a123b8e	tcp: fix NULL deref in tcp_v4_send_ack() Neal reported crashes with this stack trace : RIP: 0010:[<ffffffff8c57231b>] tcp_v4_send_ack+0x41/0x20f ... CR2: 0000000000000018 CR3: 000000044005c000 CR4: 00000000001427e0 ... [<ffffffff8c57258e>] tcp_v4_reqsk_send_ack+0xa5/0xb4 [<ffffffff8c1a7caa>] tcp_check_req+0x2ea/0x3e0 [<ffffffff8c19e420>] tcp_rcv_state_process+0x850/0x2500 [<ffffffff8c1a6d21>] tcp_v4_do_rcv+0x141/0x330 [<ffffffff8c56cdb2>] sk_backlog_rcv+0x21/0x30 [<ffffffff8c098bbd>] tcp_recvmsg+0x75d/0xf90 [<ffffffff8c0a8700>] inet_recvmsg+0x80/0xa0 [<ffffffff8c17623e>] sock_aio_read+0xee/0x110 [<ffffffff8c066fcf>] do_sync_read+0x6f/0xa0 [<ffffffff8c0673a1>] SyS_read+0x1e1/0x290 [<ffffffff8c5ca262>] system_call_fastpath+0x16/0x1b The problem here is the skb we provide to tcp_v4_send_ack() had to be parked in the backlog of a new TCP fastopen child because this child was owned by the user at the time an out of window packet arrived. Before queuing a packet, TCP has to set skb->dev to NULL as the device could disappear before packet is removed from the queue. Fix this issue by using the net pointer provided by the socket (being a timewait or a request socket). IPv6 is immune to the bug : tcp_v6_send_response() already gets the net pointer from the socket if provided. Fixes: `168a8f5805` ("tcp: TCP Fast Open Server - main code path") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 11:20:14 -08:00
Jesse Gross	35e2d1152b	tunnels: Allow IPv6 UDP checksums to be correctly controlled. When configuring checksums on UDP tunnels, the flags are different for IPv4 vs. IPv6 (and reversed). However, when lightweight tunnels are enabled the flags used are always the IPv4 versions, which are ignored in the IPv6 code paths. This uses the correct IPv6 flags, so checksums can be controlled appropriately. Fixes: `a725e514` ("vxlan: metadata based tunneling for IPv6") Fixes: `abe492b4` ("geneve: UDP checksum configuration via netlink") Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 11:10:40 -08:00
David S. Miller	4f2aaf7dd9	Merge branch 'fix-phy-ignore-interrupts' Florian Fainelli says: ==================== net: phy: Finally fix PHY_IGNORE_INTERRUPTS This patch series finally fixes how PHY_IGNORE_INTERRUPTS are treated by avoiding to poll the PHY and getting notified from link state changes by the Ethernet MAC interrupt service routine. Tested with bcmgenet since this is the HW that I have access to. Targetting the "net" tree since these are bugfixes, but I would like Woojun and Andrew to take a look and test that on their respective HW setups as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 10:48:51 -08:00
Florian Fainelli	49f7a471e4	net: bcmgenet: Properly configure PHY to ignore interrupt By the time we execute bcmgenet_mii_probe(), the MDIO bus structure has long been allocated and registered. Overirring the PHY interrupt using the MDIO bus structure has no chance to work anymore, because of_mdiobus_register() has call phy_device_create() for use, which copied the MDIO bus address's irq for the PHY into the PHY device "irq" member. Since we do have a proper reference to a PHY device in bcmgenet_mii_probe(), just assign the desired IRQ value here. Fixes: `aa09677cba` ("net: bcmgenet: add MDIO routines") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 10:48:51 -08:00
Florian Fainelli	deccd16f91	net: phy: Fix phy_mac_interrupt() Commit `5ea94e7686` ("phy: add phy_mac_interrupt()") to use with PHY_IGNORE_INTERRUPT added a cancel_work_sync() into phy_mac_interrupt() which is allowed to sleep, whereas phy_mac_interrupt() is expected to be callable from interrupt context. Now that we have fixed how the PHY state machine treats PHY_IGNORE_INTERRUPT with respect to state changes, we can just set the new link state, and queue the PHY state machine for execution so it is going to read the new link state. For that to work properly, we need to update phy_change() not to try to invoke any interrupt callbacks if we have configured the PHY device for PHY_IGNORE_INTERRUPT, because that PHY device and its driver are not required to implement those. Fixes: `5ea94e7686` ("phy: add phy_mac_interrupt() to use with PHY_IGNORE_INTERRUPT") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 10:48:51 -08:00
Florian Fainelli	d5c3d84657	net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS Commit `2c7b49212a` ("phy: fix the use of PHY_IGNORE_INTERRUPT") changed a hunk in phy_state_machine() in the PHY_RUNNING case which was not needed. The change essentially makes the PHY library treat PHY devices with PHY_IGNORE_INTERRUPT to keep polling for the PHY device, even though the intent is not to do it. Fix this by reverting that specific hunk, which makes the PHY state machine wait for state changes, and stay in the PHY_RUNNING state for as long as needed. Fixes: `2c7b49212a` ("phy: fix the use of PHY_IGNORE_INTERRUPT") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-01-21 10:48:51 -08:00

1 2 3 4 5 ...

571296 commits