Commit graph

Vincent Guittot
6b94780e45 sched/core: Use load_avg for selecting idlest group
find_idlest_group() only compares the runnable_load_avg when looking
for the least loaded group. But on a fork-intensive use case like
hackbench, where tasks block quickly after the fork, this can lead to
selecting the same CPU instead of other CPUs which have a similar
runnable load but a lower load_avg.

When the runnable_load_avg values of 2 CPUs are close, we now take into
account the amount of blocked load as a 2nd selection factor. There are
now 3 zones for the runnable_load of the rq:

 - [0 .. (runnable_load - imbalance)]:
	Select the new rq which has significantly less runnable_load

 - [(runnable_load - imbalance) .. (runnable_load + imbalance)]:
	The runnable loads are close so we use load_avg to choose
	between the 2 rqs

 - [(runnable_load + imbalance) .. ULONG_MAX]:
	Keep the current rq which has significantly less runnable_load

The scale factor that is currently used for comparing runnable_load
doesn't work well with small values. As an example, the use of a
scaling factor fails as soon as this_runnable_load == 0, because we
then always select the local rq even if min_runnable_load is only 1,
which doesn't really make sense because the two are practically the
same. So instead of a scaling factor, we use an absolute margin for
runnable_load to detect CPUs with a similar runnable_load, and we keep
using a scaling factor for the blocked load.
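
A minimal C sketch of the resulting 3-zone comparison (variable names
follow the description above and are assumptions, not a verbatim quote
of the patch):

  /* 'imbalance' is the absolute margin for runnable_load,
   * 'imbalance_scale' the relative factor kept for the blocked load.
   */
  if (min_runnable_load > (this_runnable_load + imbalance))
          /* local rq has significantly less runnable_load: keep it */
          return NULL;

  if ((this_runnable_load < (min_runnable_load + imbalance)) &&
      (100 * this_avg_load < imbalance_scale * min_avg_load))
          /* runnable loads are close and load_avg favors the local rq */
          return NULL;

  /* otherwise the new rq wins */
  return idlest;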

For a use case like hackbench, this enables the scheduler to select
different CPUs during the fork sequence and to spread tasks across the
system.

Tests have been done on a Hikey board (ARM based octo cores) for
several kernels. The results below give the min, max, avg and stdev
values of 18 runs with each configuration.

The patches depend on the "no missing update_rq_clock()" work.

hackbench -P -g 1

         ea86cb4b76    7dc603c902    v4.8        v4.8+patches
  min    0.049         0.050         0.051       0.048
  avg    0.057         0.057(0%)     0.057(0%)   0.055(+5%)
  max    0.066         0.068         0.070       0.063
  stdev  +/-9%         +/-9%         +/-8%       +/-9%

More performance numbers here:

  https://lkml.kernel.org/r/20161203214707.GI20785@codeblueprint.co.uk

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: kernellwp@gmail.com
Cc: umgwanakikbuti@gmail.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1481216215-24651-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:57 +01:00
Vincent Guittot
f519a3f1c6 sched/core: Fix find_idlest_group() for fork
During fork, the utilization of a task is initialized once the rq has
been selected, because the current utilization level of the rq is used
to set the utilization of the forked task. As the task's utilization is
still 0 at this step of the fork sequence, it doesn't make sense to
look for some spare capacity that can fit the task's utilization.
Furthermore, I can see perf regressions for the test:

   hackbench -P -g 1

because the least loaded policy is always bypassed and tasks are not
spread during fork.

With this patch and the fix below, we are back to the same performance
as for v4.8. The fix below is only a temporary one, used for the test
until a smarter solution is found, because we can't simply remove the
test, which is useful for other benchmarks:

| @@ -5708,13 +5708,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
|
|	avg_cost = this_sd->avg_scan_cost;
|
| -	/*
| -	 * Due to large variance we need a large fuzz factor; hackbench in
| -	 * particularly is sensitive here.
| -	 */
| -	if ((avg_idle / 512) < avg_cost)
| -		return -1;
| -
|	time = local_clock();
|
|	for_each_cpu_wrap(cpu, sched_domain_span(sd), target, wrap) {
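
For reference, a hedged sketch of what this patch itself changes in
find_idlest_group() (as opposed to the temporary hack above; the control
flow is reconstructed from the description and the label name is an
assumption):

  /* Spare capacity can't be used for fork: the task's utilization
   * is still 0 at this point, so go straight to the least-loaded
   * comparison.
   */
  if (sd_flag & SD_BALANCE_FORK)
          goto skip_spare;

  /* ... spare-capacity based selection ... */

  skip_spare:
  /* ... least-loaded (runnable_load / load_avg) selection ... */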

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: kernellwp@gmail.com
Cc: umgwanakikbuti@gmail.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1481216215-24651-2-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:56 +01:00
Ingo Molnar
6643aab30f Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:10:40 +01:00
Peter Zijlstra
11f254dbb3 x86/paravirt: Fix bool return type for PVOP_CALL()
Commit:

  3cded41794 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")

introduced a paravirt op with a bool return type [*].

It turns out that the PVOP_CALL*() macros miscompile when rettype is
bool. Code that looked like:

   83 ef 01                sub    $0x1,%edi
   ff 15 32 a0 d8 00       callq  *0xd8a032(%rip)        # ffffffff81e28120 <pv_lock_ops+0x20>
   84 c0                   test   %al,%al

ended up looking like so after PVOP_CALL1() was applied:

   83 ef 01                sub    $0x1,%edi
   48 63 ff                movslq %edi,%rdi
   ff 14 25 20 81 e2 81    callq  *0xffffffff81e28120
   48 85 c0                test   %rax,%rax

Note how it tests the whole of %rax, even though a typical bool return
function only sets %al, like:

  0f 95 c0                setne  %al
  c3                      retq

This is because ____PVOP_CALL() does:

		__ret = (rettype)__eax;

and while regular integer type casts truncate the result, a cast to
bool tests for any !0 value. Fix this by explicitly truncating to
sizeof(rettype) before casting.
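
A hedged sketch of the intended truncation (the macro name and exact
expression are assumptions based on the description, not a verbatim
quote of the patch):

  /* Build a mask covering only sizeof(rettype) bytes, so that a
   * bool (1 byte) only ever sees %al, not the whole of %rax.
   */
  #define PVOP_RETMASK(rettype)                                    \
          ({ unsigned long __mask = ~0UL;                          \
             if (sizeof(rettype) < sizeof(unsigned long))          \
                     __mask = (1UL << (8 * sizeof(rettype))) - 1;  \
             __mask;                                               \
          })

  /* ... in ____PVOP_CALL(): truncate before the cast */
  __ret = (rettype)(__eax & PVOP_RETMASK(rettype));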

[*] The actual bug should've been exposed in commit:
      446f3dc8cc ("locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests")
    but that didn't properly implement the paravirt call.

Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3cded41794 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.346057680@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:09:20 +01:00
Peter Zijlstra
45dbea5f55 x86/paravirt: Fix native_patch()
While chasing a regression I noticed we potentially patch the wrong
code in native_patch().

If we do not select the native code sequence, we must use the default
patcher, not fall through the switch case.
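
A hedged sketch of the shape of the fix (symbol names are illustrative;
the point is the explicit goto instead of the accidental fall-through):

  case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
          if (pv_is_native_spin_unlock()) {
                  start = start_pv_lock_ops_queued_spin_unlock;
                  end   = end_pv_lock_ops_queued_spin_unlock;
                  goto patch_site;
          }
          goto patch_default;     /* do NOT fall through */
  ...
  default:
  patch_default:
          ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
          break;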

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel test robot <xiaolong.ye@intel.com>
Fixes: 3cded41794 ("x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()")
Link: http://lkml.kernel.org/r/20161208154349.270616999@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:09:19 +01:00
Ingo Molnar
6f38751510 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:07:13 +01:00
Andi Kleen
b0c1ef5295 perf/x86: Fix exclusion of BTS and LBR for Goldmont
An earlier patch allowed enabling PT and LBR at the same
time on Goldmont. However, it also allowed enabling BTS and LBR
at the same time, which is still not supported. Fix this by
bypassing the check only for PT.
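
A hedged sketch of the adjusted check (the enum and field names are
assumptions based on the description):

  int x86_add_exclusive(unsigned int what)
  {
          /* PT may coexist with LBR on this core, but BTS and LBR
           * remain mutually exclusive: bypass the check only for PT.
           */
          if (x86_pmu.lbr_pt_coexist && what == x86_lbr_exclusive_pt)
                  return 0;
          /* ... normal exclusivity accounting ... */
  }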

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: alexander.shishkin@intel.com
Cc: kan.liang@intel.com
Cc: <stable@vger.kernel.org>
Fixes: ccbebba4c6 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it")
Link: http://lkml.kernel.org/r/20161209001417.4713-1-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:06:09 +01:00
Ingo Molnar
6de75a37b8 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-11 13:05:59 +01:00
Hauke Mehrtens
ba735155b9 MIPS: Lantiq: Fix mask of GPE frequency
The hardware documentation says bits 11:10 are used for the GPE
frequency selection. Fix the mask in the define to match these bits.
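
A hedged sketch of the corrected define (the symbol name is
illustrative, not necessarily the one used in the driver):

  /* bits 11:10 select the GPE frequency */
  #define GPE_FSEL_MASK   GENMASK(11, 10)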

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Thomas Langer <thomas.langer@intel.com>
Cc: linux-mips@linux-mips.org
Cc: john@phrozen.org
Patchwork: https://patchwork.linux-mips.org/patch/14648/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-12-11 11:20:25 +01:00
Luuk Paulussen
edb6fa1a64 MIPS: Return -ENODEV from weak implementation of rtc_mips_set_time
The sync_cmos_clock function in kernel/time/ntp.c first tries to update
the internal clock of the CPU by calling the architecture-specific
"update_persistent_clock64" function. If this returns -ENODEV, it then
tries to update an external RTC using "rtc_set_ntp_time".

On the MIPS architecture, the weak implementation of the underlying
function would return 0 if it wasn't overridden. This meant that the
sync_cmos_clock function would never try to update an external RTC
(if both CONFIG_GENERIC_CMOS_UPDATE and CONFIG_RTC_SYSTOHC are
configured).

Returning -ENODEV instead means that an external RTC will be tried.
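
A minimal sketch of the weak default in question (assuming the helper
name used on MIPS per the description):

  int __weak rtc_mips_set_time(unsigned long sec)
  {
          /* No RTC hooked up: say so, so that sync_cmos_clock falls
           * back to rtc_set_ntp_time() for an external RTC.
           */
          return -ENODEV;
  }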

Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Reviewed-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
Reviewed-by: Scott Parlane <scott.parlane@alliedtelesis.co.nz>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14649/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-12-11 11:19:04 +01:00
WANG Cong
3111912971 e1000: use disable_hardirq() for e1000_netpoll()
In commit 02cea39586 ("genirq: Provide disable_hardirq()")
Peter introduced disable_hardirq() for netpoll, but using it
in e1000 was forgotten.

This patch changes disable_irq() to disable_hardirq() for e1000.
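
A hedged sketch of the resulting netpoll handler (surrounding code is
assumed from the e1000 driver conventions):

  static void e1000_netpoll(struct net_device *netdev)
  {
          struct e1000_adapter *adapter = netdev_priv(netdev);

          /* disable_hardirq() may be called from the netpoll context,
           * unlike disable_irq(), which can sleep.
           */
          if (disable_hardirq(adapter->pdev->irq))
                  e1000_intr(adapter->pdev->irq, netdev);
          enable_irq(adapter->pdev->irq);
  }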

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Keller, Jacob E
0266ac4536 i40e: don't truncate match_method assignment
The .match_method field is a u8, so we shouldn't be casting to a u16,
and because it is only one byte, we do not need to byte swap anything.
Just assign the value directly. This avoids issues on Big Endian
architectures which would have byte swapped and then incorrectly
truncated the value.
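
A hedged before/after sketch (the surrounding structure and constant
are illustrative, not the driver's exact code):

  /* before: pointless cast and byte swap of a single byte */
  cmd->match_method = cpu_to_le16((u16)I40E_AQC_MM_PERFECT_MATCH);

  /* after: direct assignment; a u8 has no byte order */
  cmd->match_method = I40E_AQC_MM_PERFECT_MATCH;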

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bimmy Pujari <bimmy.pujari@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
WingMan Kwok
6246168b4a net: ethernet: ti: netcp: add support of cpts
This patch adds support for the CPTS device found in the
GbE and 10GbE Ethernet switches on the Keystone 2 SoCs
(66AK2E/L/Hx, 66AK2Gx).

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Timur Tabi
529ed12752 net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause
Instead of having individual PHY drivers set the SUPPORTED_Pause and
SUPPORTED_Asym_Pause flags, phylib itself should set those flags,
unless there is a hardware erratum or other special case.  During
autonegotiation, the PHYs will determine whether to enable pause
frame support.

Pause frames are a feature that is supported by the MAC.  It is the MAC
that generates the frames and that processes them.  The PHY can only be
configured to allow them to pass through.

This commit also effectively reverts the recently applied commit
c7a61319 ("net: phy: dp83848: Support ethernet pause frames").

So the new process is:

1) Unless the PHY driver overrides it, phylib sets the SUPPORTED_Pause
and SUPPORTED_AsymPause bits in phydev->supported.  This indicates that
the PHY supports pause frames.

2) The MAC driver checks phydev->supported before it calls phy_start().
If (SUPPORTED_Pause | SUPPORTED_AsymPause) is set, then the MAC driver
sets those bits in phydev->advertising, if it wants to enable pause
frame support.

3) When the link state changes, the MAC driver checks phydev->pause and
phydev->asym_pause. If the bits are set, then it enables the corresponding
features in the MAC. The algorithm is:

	if (phydev->pause)
		The MAC should be programmed to receive and honor
		pause frames it receives, i.e. enable receive flow control.

	if (phydev->pause != phydev->asym_pause)
		The MAC should be programmed to transmit pause
		frames when needed, i.e. enable transmit flow control.
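
A hedged C sketch of steps 2 and 3 in a MAC driver (the flow control
helpers are placeholders, not a real API):

  /* step 2: advertise pause support before phy_start() */
  if (phydev->supported & (SUPPORTED_Pause | SUPPORTED_Asym_Pause))
          phydev->advertising |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;

  /* step 3: in the adjust_link callback */
  if (phydev->pause)
          mac_enable_rx_flow_control(priv);       /* placeholder */
  if (phydev->pause != phydev->asym_pause)
          mac_enable_tx_flow_control(priv);       /* placeholder */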

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:31:19 -05:00
Asbjørn Sloth Tønnesen
fba40c632c net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
Asbjørn Sloth Tønnesen
47c3e7783b net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
PPPOL2TP_MSG_* and L2TP_MSG_* are duplicates, and are being used
interchangeably in the kernel, so let's standardize on L2TP_MSG_*
internally, and keep PPPOL2TP_MSG_* defined in UAPI for compatibility.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
Asbjørn Sloth Tønnesen
41c43fbee6 net: l2tp: export debug flags to UAPI
Move the L2TP_MSG_* definitions to UAPI, as it is part of
the netlink API.

Signed-off-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.st>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:29:11 -05:00
David S. Miller
0972770eea Merge branch 'sxgbe-stmmac-remove-private-tx-lock'
Lino Sanfilippo says:

====================
Remove private tx queue locks

this patch series removes unnecessary private locks in the sxgbe and
stmmac drivers.

v2:
- adjust commit message
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:27:02 -05:00
Lino Sanfilippo
739c8e149a net: ethernet: stmmac: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.

On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.

Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:26:54 -05:00
Lino Sanfilippo
980f140493 net: ethernet: sxgbe: remove private tx queue lock
The driver uses a private lock for synchronization of the xmit function and
the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
the xmit function is also called with the xmit_lock held.

On the other hand the completion handler uses the reverse locking order by
first taking the private lock and (in case that the tx queue had been
stopped) then the xmit_lock.

Improve the locking by removing the private lock and using only the
xmit_lock for synchronization instead.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 23:23:35 -05:00
David S. Miller
c280b48266 Merge branch 'bridge-fast-ageing-on-topology-change'
Vivien Didelot says:

====================
net: bridge: fast ageing on topology change

802.1D [1] specifies that the bridges in a network must use a short
value to age out dynamic entries in the Filtering Database for a period,
once a topology change has been communicated by the root bridge.

This patchset fixes this for the in-kernel STP implementation.

Once the topology change flag is set in a net_bridge instance, the
ageing time value is shortened to twice the forward delay used by the
topology.

When the topology change flag is cleared, the ageing time configured for
the bridge is restored.

To accomplish that, a new bridge_ageing_time member is added to the
net_bridge structure, to store the user configured bridge ageing time.

Two helpers are added to offload the ageing time and set the topology
change flag in the net_bridge instance. Then the required logic is added
in the topology change helper if in-kernel STP is used.

This has been tested on the following topology:

    +--------------+
    | root bridge  |
    |  1  2  3  4  |
    +--+--+--+--+--+
       |  |  |  |      +--------+
       |  |  |  +------| laptop |
       |  |  |         +--------+
    +--+--+--+-----+
    |  1  2  3     |
    | slave bridge |
    +--------------+

When unplugging/replugging the laptop, the slave bridge (under test)
gets the topology change flag sent by the root bridge, and fast ageing
is triggered on the bridges. Once the topology change timer of the root
bridge expires, the topology change flag is cleared and the configured
ageing time is restored on the bridges.

A similar test has been done between two bridges under test.
When changing the forward delay of the root bridge with:

    # echo 3000 > /sys/class/net/br0/bridge/forward_delay

the ageing time correctly changes on both bridges from 300s to 60s while
the TOPOLOGY_CHANGE flag is present.

[1] "8.3.5 Notifying topology changes",
    http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf

No change since RFC: https://lkml.org/lkml/2016/10/19/828
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:28:39 -05:00
Vivien Didelot
34d8acd8aa net: bridge: shorten ageing time on topology change
802.1D [1] specifies that the bridges must use a short value to age out
dynamic entries in the Filtering Database for a period, once a topology
change has been communicated by the root bridge.

Add a bridge_ageing_time member in the net_bridge structure to store the
bridge ageing time value configured by the user (ioctl/netlink/sysfs).

If we are using in-kernel STP, shorten the ageing time value to twice
the forward delay used by the topology when the topology change flag is
set. When the flag is cleared, restore the configured ageing time.

[1] "8.3.5 Notifying topology changes ",
    http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf
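
A hedged sketch of the resulting helper (reconstructed from this
description and the companion patches; __set_ageing_time is the offload
helper added separately):

  void __br_set_topology_change(struct net_bridge *br, unsigned char val)
  {
          if (br->stp_enabled == BR_KERNEL_STP &&
              br->topology_change != val) {
                  if (val)        /* fast ageing while the flag is set */
                          __set_ageing_time(br->dev, 2 * br->forward_delay);
                  else            /* restore the configured value */
                          __set_ageing_time(br->dev, br->bridge_ageing_time);
          }
          br->topology_change = val;
  }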

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:28:28 -05:00
Vivien Didelot
8384b5f5b2 net: bridge: add helper to set topology change
Add a __br_set_topology_change helper to set the topology change value.

This can be later extended to add actions when the topology change flag
is set or cleared.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:27:23 -05:00
Vivien Didelot
82dd4332aa net: bridge: add helper to offload ageing time
The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME switchdev attr is actually set
when initializing a bridge port, and when configuring the bridge ageing
time from ioctl/netlink/sysfs.

Add a __set_ageing_time helper to offload the ageing time to physical
switches, and add the SWITCHDEV_F_DEFER flag since it can be called
under bridge lock.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 21:27:23 -05:00
Philippe Reynes
bfd8d977af net: nicvf: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to the new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:31:44 -05:00
Ivan Khoronzhuk
52986a2f92 net: ethernet: ti: cpsw: sync rates for channels in dual emac mode
The channels are common for both ndevs in dual emac mode. Hence, keep
their rates in sync.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:29:47 -05:00
Ivan Khoronzhuk
0be01b8e0a net: ethernet: ti: cpsw: re-split res only when speed is changed
Don't re-split res in the following cases:
- speed of phys is not changed
- speed of phys is changed and no rate limited channels
- speed of phys is changed and all channels are rate limited
- phy is unlinked while dev is open
- phy is linked back but speed is not changed

The maximum speed is the sum of the "linked" phys, thus res are split
taking into account the two interfaces, both for dual emac mode and
for switch mode.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:29:47 -05:00
Ivan Khoronzhuk
32b78d8563 net: ethernet: ti: cpsw: combine budget and weight split and check
Re-split the weight along with the budget. This simplifies the code a
little and updates the state after every rate change. It is also
necessary to move the argument checks into this combined function.
Replace the maximum rate check for an interface with the maximum
possible rate.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:29:47 -05:00
Ivan Khoronzhuk
32b5f2d1f9 net: ethernet: ti: cpsw: don't start queue twice
No need to start the queues after cpsw is started, as this will be
done in cpsw_adjust_link(), after the phy connection.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:29:47 -05:00
Ivan Khoronzhuk
cb7d78d045 net: ethernet: ti: cpsw: use same macros to get active slave
Use the same, more convenient macros to get the active slave.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:29:47 -05:00
Arnd Bergmann
0e85663b88 net: mvneta: select GENERIC_ALLOCATOR
We previously relied on GENERIC_ALLOCATOR to be selected by CONFIG_ARM,
but now we can compile-test the driver on other architectures that
don't select it:

drivers/net/built-in.o: In function `mvneta_bm_remove':
mvneta_bm.c:(.text+0x4ee35): undefined reference to `gen_pool_free'

This adds an explicit select for the part of the driver that has
the dependency.

Fixes: a0627f776a ("net: marvell: Allow drivers to be built with COMPILE_TEST")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:27:53 -05:00
Amit Kushwaha
fa1bd57a63 net: socket: removed an unnecessary newline
This patch removes an unnecessary newline which was added
to the socket.c file in net-next.

Signed-off-by: Amit Kushwaha <kushwaha.a@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:27:07 -05:00
WANG Cong
efa172f428 netlink: use blocking notifier
netlink_chain is called in ->release(), which is apparently
a process context, so we don't have to use an atomic notifier
here.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-10 17:25:58 -05:00
David S. Miller
821781a9f4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-10 16:21:55 -05:00
Linus Torvalds
045169816b Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - Fix pointer size when caam is used with AArch64 boot loader on
     AArch32 kernel.

   - Fix ahash state corruption in marvell driver.

   - Fix buggy algif_aead tag handling.

   - Prevent mcryptd from being used with incompatible algorithms which
     can cause crashes"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_aead - fix uninitialized variable warning
  crypto: mcryptd - Check mcryptd algorithm compatibility
  crypto: algif_aead - fix AEAD tag memory handling
  crypto: caam - fix pointer size for AArch64 boot loader, AArch32 kernel
  crypto: marvell - Don't corrupt state of an STD req for re-stepped ahash
  crypto: marvell - Don't copy hash operation twice into the SRAM
2016-12-10 09:47:13 -08:00
Linus Torvalds
cd6628953e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Limit the number of can filters to avoid > MAX_ORDER allocations.
    Fix from Marc Kleine-Budde.

 2) Limit GSO max size in netvsc driver to avoid problems with NVGRE
    configurations. From Stephen Hemminger.

 3) Return proper error when memory allocation fails in
    ser_gigaset_init(), from Dan Carpenter.

 4) Missing linkage undo in error paths of ipvlan_link_new(), from Gao
    Feng.

 5) Missing necessary SET_NETDEV_DEV in lantiq and cpmac drivers, from
    Florian Fainelli.

 6) Handle probe deferral properly in smsc911x driver.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: mlx5: Fix Kconfig help text
  net: smsc911x: back out silently on probe deferrals
  ibmveth: set correct gso_size and gso_type
  net: ethernet: cpmac: Call SET_NETDEV_DEV()
  net: ethernet: lantiq_etop: Call SET_NETDEV_DEV()
  vhost-vsock: fix orphan connection reset
  cxgb4/cxgb4vf: Assign netdev->dev_port with port ID
  driver: ipvlan: Unlink the upper dev when ipvlan_link_new failed
  ser_gigaset: return -ENOMEM on error instead of success
  NET: usb: cdc_mbim: add quirk for supporting Telit LE922A
  can: peak: fix bad memory access and free sequence
  phy: Don't increment MDIO bus refcount unless it's a different owner
  netvsc: reduce maximum GSO size
  drivers: net: cpsw-phy-sel: Clear RGMII_IDMODE on "rgmii" links
  can: raw: raw_setsockopt: limit number of can_filter that can be set
2016-12-10 09:23:19 -08:00
Christopher Covington
d33695fbfa net: mlx5: Fix Kconfig help text
Since the following commit, Infiniband and Ethernet have not been
mutually exclusive.

Fixes: 4aa17b28 ("mlx5: Enable mutual support for IB and Ethernet")
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 23:08:32 -05:00
Eric Dumazet
3174fed982 net: skb_condense() can also deal with empty skbs
It seems attackers can also send UDP packets with no payload at all.

skb_condense() can still be a win in this case.

It will be possible to replace the custom code in tcp_add_backlog()
to get the full benefit from skb_condense().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 23:06:10 -05:00
Linus Walleij
ab4e4c07ac net: smsc911x: back out silently on probe deferrals
When trying to get a regulator we may get deferred and we see
this noise:

smsc911x 1b800000.ethernet-ebi2 (unnamed net_device) (uninitialized):
   couldn't get regulators -517

Then the driver continues anyway, which means that the regulator
may not be properly retrieved and reference counted, and may be
switched off in case no one else is using it.

Fix this by returning silently on deferred probe and let the
system work it out.

Cc: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 23:05:16 -05:00
David S. Miller
5ac9efbe1c Three fixes:
* fix a logic bug introduced by a previous cleanup
  * fix nl80211 attribute confusion (trying to use
    a single attribute for two purposes)
  * fix a long-standing BSS leak that happens when an
    association attempt is abandoned

Merge tag 'mac80211-next-for-davem-2016-12-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Three fixes:
 * fix a logic bug introduced by a previous cleanup
 * fix nl80211 attribute confusion (trying to use
   a single attribute for two purposes)
 * fix a long-standing BSS leak that happens when an
   association attempt is abandoned
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:59:05 -05:00
Thomas Falcon
7b5967389f ibmveth: set correct gso_size and gso_type
This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:

"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.

We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even though the socket's receive
buffer was completely empty."

We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.

Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:47:22 -05:00
David S. Miller
524a64c726 Merge branch 'udp-receive-path-optimizations'
Eric Dumazet says:

====================
udp: receive path optimizations

This patch series provides about a 100% performance increase under flood.

v2: added Paolo feedback on udp_rmem_release() for tiny sk_rcvbuf
    added the last patch touching sk_rmem_alloc later
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:12:30 -05:00
Eric Dumazet
02ab0d139c udp: udp_rmem_release() should touch sk_rmem_alloc later
In flood situations, keeping sk_rmem_alloc at a high value
prevents producers from touching the socket.

It makes sense to lower sk_rmem_alloc only at the end
of udp_rmem_release(), after the thread draining the receive
queue in udp_recvmsg() has finished the writes to sk_forward_alloc.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:12:21 -05:00
Eric Dumazet
6b229cf77d udp: add batching to udp_rmem_release()
If udp_recvmsg() constantly releases sk_rmem_alloc
for every packet it reads, it gives producers an opportunity
to immediately grab spinlocks and desperately
try adding another packet, causing false sharing.

We can add a simple heuristic to give the signal
by batches of ~25% of the queue capacity.
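
A hedged sketch of the heuristic (the deficit field name is an
assumption):

  /* batch the release by ~25% of the queue capacity; a full
   * (non-partial) release still flushes whatever is pending
   */
  up->forward_deficit += size;
  if (partial && up->forward_deficit < (sk->sk_rcvbuf >> 2))
          return;
  size = up->forward_deficit;
  up->forward_deficit = 0;

  atomic_sub(size, &sk->sk_rmem_alloc);
  sk->sk_forward_alloc += size;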

This patch considerably increases performance under
flood by about 50%, since the thread draining the queue
is no longer slowed by false sharing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:12:21 -05:00
Eric Dumazet
c84d949057 udp: copy skb->truesize in the first cache line
In the UDP RX handler, we currently clear skb->dev before the skb
is added to the receive queue, because the device pointer is no
longer available once we exit from the RCU section.

Since this first cache line is always hot, let's reuse this space
to store skb->truesize, and thus avoid a cache line miss at
udp_recvmsg()/udp_skb_destructor time while the receive queue
spinlock is held.
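
A hedged sketch of the idea (the scratch field name is an assumption):

  /* enqueue path: skb->dev is about to become meaningless, so store
   * truesize in the same (hot) first cache line
   */
  skb->dev_scratch = skb->truesize;

  /* dequeue path, under the receive queue spinlock: */
  static inline int udp_skb_truesize(struct sk_buff *skb)
  {
          return skb->dev_scratch;  /* no cold cache line touched */
  }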

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:12:21 -05:00
Eric Dumazet
4b272750db udp: add busylocks in RX path
The idea of busylocks is to let producers grab an extra spinlock
to relieve pressure on the receive_queue spinlock shared by the consumer.

This behavior is requested only once the socket receive queue is above
half occupancy.

Under flood, this means that only one producer can be in line
trying to acquire the receive_queue spinlock.

These busylocks can be allocated in a per-cpu manner, instead of a
per-socket one (which would consume a cache line per socket).
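
A hedged sketch of the producer-side acquire (table sizing and names
are assumptions):

  static spinlock_t busylock[1 << BUSYLOCK_BITS];  /* shared, hashed */

  static spinlock_t *busylock_acquire(struct sock *sk)
  {
          spinlock_t *busy = &busylock[hash_ptr(sk, BUSYLOCK_BITS)];

          /* queue producers here so only one of them contends on
           * the receive_queue spinlock at a time
           */
          spin_lock(busy);
          return busy;
  }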

This patch considerably improves UDP behavior under stress,
depending on number of NIC RX queues and/or RPS spread.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:12:21 -05:00
David S. Miller
d96dac1454 Merge branch 'qcom-emac'
Timur Tabi says:

====================
net: qcom/emac: simplify support for different SOCs

On SOCs that have the Qualcomm EMAC network controller, the internal
PHY block is always different.  Sometimes the differences are small,
sometimes it might be a completely different IP.  Either way, using version
numbers to differentiate them and putting all of the init code in one
file does not scale.

This patchset does two things:  The first breaks up the current code into
different files, and the second patch adds support for a third SOC, the
Qualcomm Technologies QDF2400 ARM Server SOC.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:11:15 -05:00
Timur Tabi
a51f404723 net: qcom/emac: add support for the Qualcomm Technologies QDF2400
The QDF2432 and the QDF2400 have slightly different internal PHYs,
so there are some programming differences.  Some of the registers in
the QDF2400 have moved, and some registers require different values
during initialization.

Because of the differences, and because HIDs are a scarce resource,
the ACPI tables specify the hardware version in an _HRV property.
Version 1 is the QDF2432, and version 2 is the QDF2400.  Any future
SOC that has the same internal PHY but different programming
requirements will be assigned the next available version number.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:11:02 -05:00
Timur Tabi
1e88ab6fbb net: qcom/emac: move phy init code to separate files
The internal PHY of the EMAC differs on each SOC, and the list will
only continue to grow.  By separating the code into individual files,
we can add support for more SOCs more cleanly.

Note: The internal PHY is also sometimes called the SGMII device.

We also stop referring to the various PHY variations by version number,
so no more "v2", "v3", etc.  Instead, the devices are named after the
SOC they are, which is in sync with the device tree property names.

Future patches will probably rearrange more code among the files.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-09 22:11:02 -05:00
Thomas Gleixner
990e9dc381 x86/ldt: Make all size computations unsigned
ldt->size can never be negative. The helper functions take 'unsigned int'
arguments which are assigned from ldt->size. The related user space
user_desc struct member entry_number is unsigned as well.

But ldt->size itself and a few local variables which are related to
ldt->size are of type 'int', which makes no sense whatsoever and
results in typecasts which make the eyes bleed.

Clean it up and convert everything which is related to ldt->size to
unsigned int.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
2016-12-10 00:24:39 +01:00