Commit Graph

615248 Commits

Author SHA1 Message Date
Pablo Neira Ayuso 0ed6389c48 netfilter: nf_tables: rename set implementations
Use nft_set_* prefix for backend set implementations, thus we can use
nft_hash for the new hash expression.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:44:37 +02:00
Florian Westphal a6c46d9bc9 ipvs: use nf_ct_kill helper
Once timer is removed from nf_conn struct we cannot open-code
the removal sequence anymore.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:43:52 +02:00
Florian Westphal d0b35b93d4 netfilter: use_nf_conn_expires helper in more places
... so we don't need to touch all of these places when we get rid of the
timer in nf_conn.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:43:13 +02:00
Liping Zhang 9f7c824a44 netfilter: nf_dup4: remove redundant checksum recalculation
IP header checksum will be recalculated at ip_local_out, so
there's no need to calculate it here, remove it. Also update
code comments to illustrate it, and delete the misleading
comments about checksum recalculation.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:42:47 +02:00
Hangbin Liu ceee4091d6 netfilter: physdev: add missed blank
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:42:14 +02:00
Gao Feng e5e693ab49 netfilter: conntrack: Only need first 4 bytes to get l4proto ports
We only need first 4 bytes instead of 8 bytes to get the ports of
tcp/udp/dccp/sctp/udplite in their pkt_to_tuple function.
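
A minimal sketch (user-space Python, not the kernel code) of why 4 bytes are enough: in all of these protocols the header starts with a 16-bit source port followed by a 16-bit destination port. The function name l4_ports is hypothetical and chosen only for illustration.

```python
import struct

def l4_ports(l4_header):
    """Return (source_port, destination_port) from the start of an L4 header.

    Hypothetical helper: only the first 4 bytes are inspected, since
    TCP, UDP, DCCP, SCTP and UDP-Lite all begin with two 16-bit ports.
    """
    if len(l4_header) < 4:
        raise ValueError("need at least 4 bytes of L4 header")
    # Two unsigned 16-bit fields in network (big-endian) byte order.
    return struct.unpack("!HH", l4_header[:4])

# Example: a UDP header with source port 53 and destination port 33000.
udp_header = struct.pack("!HHHH", 53, 33000, 8, 0)
print(l4_ports(udp_header))
```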

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-08-12 00:41:08 +02:00
Philippe Reynes f08aff444a net: ethernet: renesas: sh_eth: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 23:14:53 -07:00
Philippe Reynes 9fd0375ad3 net: ethernet: renesas: sh_eth: use phydev from struct net_device
The private structure contains a pointer to phydev, but the structure
net_device already contains such a pointer. So we can remove the pointer
phy_dev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 23:14:53 -07:00
Adam Barth 05b8ad25bc samples/bpf: fix bpf_perf_event_output prototype
The commit 555c8a8623 ("bpf: avoid stack copy and use skb ctx for event output")
started using 20 of initially reserved upper 32-bits of 'flags' argument
in bpf_perf_event_output(). Adjust corresponding prototype in samples/bpf/bpf_helpers.h

Signed-off-by: Adam Barth <arb@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 23:12:31 -07:00
Harini Katakam fff8019a08 net: macb: Add 64 bit addressing support for GEM
This patch adds support for 64 bit addressing and BDs.
-> Enable 64 bit addressing in DMACFG register.
-> Set DMA mask when design config register shows support for 64 bit addr.
-> Add new BD words for higher address when 64 bit DMA support is present.
-> Add and update TBQPH and RBQPH for MSB of BD pointers.
-> Change extraction and update of buffer addresses to use
64 bit address.
-> In gem_rx extract address in one place instead of two and use a
separate flag for RXUSED.

Signed-off-by: Harini Katakam <harinik@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:38:15 -07:00
Sudarsana Reddy Kalluru 054c67d1c8 qed*: Add support for ethtool link_ksettings callbacks.
This patch adds the driver implementation for ethtool link_ksettings
callbacks. qed driver now defines/uses the qed specific masks for
representing link capability values. qede driver maps these values to
the new link modes defined by the kernel implementation of link_ksettings.

Please consider applying this to 'net-next' branch.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:36:02 -07:00
David S. Miller e27d6cf55e Merge branch 'cpsw-refactor'
Ivan Khoronzhuk says:

====================
net: ethernet: ti: cpsw: split driver data and per ndev data

In dual_emac mode the driver can handle 2 network devices. Each of them can use
its own private data and common data/resources. This patchset splits common driver
data/resources and private per net device data.
It leads to:
- reduced memory usage
- increased code readability
- allows a number of simplifications
- creates prerequisites for adding multi-channel support,
  when channels are shared between net devices

Doesn't have bad impact on performance.
v2: https://lkml.org/lkml/2016/8/6/108

Since v2:
- removed patch:
  net: ethernet: ti: cpsw: fix int dbg message
- replaced patch:
  "net: ethernet: ti: cpsw: remove redundant check in napi poll"
  on "net: ethernet: ti: cpsw: remove intr dbg msg from poll handlers"
- removed macro "cpsw_get_slave_ndev"
- corrected some commits

Since v1:
- added several patch improvements
- avoided variable reordering in structures
- removed static variable for common function
- split big patch on several patches:
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:41 -07:00
Ivan Khoronzhuk 2a05a622d8 net: ethernet: ti: cpsw: move ale, cpts and drivers params under cpsw_common
The ale, cpts, version, rx_packet_max, bus_freq, interrupt pacing
parameters are common per net device that uses the same h/w. So,
move them to common driver structure.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:41 -07:00
Ivan Khoronzhuk dbc4ec522d net: ethernet: ti: cpsw: move napi struct to cpsw_common
The napi structs are common for both net devices in dual_emac
mode. In order to not hold duplicate links to them, move them to
cpsw_common.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:40 -07:00
Ivan Khoronzhuk 606f399395 net: ethernet: ti: cpsw: move platform data and slaves info to cpsw_common
These data are common for net devs in dual_emac mode. No need to hold
it for every priv instance, so move them under cpsw_common.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:40 -07:00
Ivan Khoronzhuk e38b5a3db8 net: ethernet: ti: cpsw: move irq stuff under cpsw_common
The irq data are common for net devs in dual_emac mode. So no need to
hold these data in every priv struct, move them under cpsw_common.
Also delete irq_num var, as after optimization it's not needed.
Correct number of irqs to 2, as anyway, driver is using only 2,
at least for now.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:40 -07:00
Ivan Khoronzhuk 2c836bd9a2 net: ethernet: ti: cpsw: move cpdma resources to cpsw_common
Every net device private struct holds links to shared cpdma resources.
No need to save and every time synchronize these resources per net dev.
So, move it to common driver struct.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:40 -07:00
Ivan Khoronzhuk 5d8d0d4d46 net: ethernet: ti: cpsw: move links on h/w registers to cpsw_common
The pointers on h/w registers are common for every cpsw_private
instance, so no need to hold them for every ndev.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:40 -07:00
Ivan Khoronzhuk 56e31bd893 net: ethernet: ti: cpsw: replace pdev on dev
No need to hold pdev link when only dev is needed.
This allows simplifying a bunch of cpsw->pdev->dev now and further on.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:39 -07:00
Ivan Khoronzhuk 649a1688c9 net: ethernet: ti: cpsw: create common struct to hold shared driver data
This patch simply creates a holder for common data and, as a start, moves
the pdev var to it.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:39 -07:00
Ivan Khoronzhuk 82b52104a3 net: ethernet: ti: cpsw: don't check slave num in runtime
No need to check const slave num in runtime for every packet,
and ndev for slaves w/o ndev is anyway NULL. So remove redundant
check and macro.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:39 -07:00
Ivan Khoronzhuk ef4183a1d7 net: ethernet: ti: cpsw: remove clk var from priv
There is no need to hold link to clk, it's used only once
while probe.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:39 -07:00
Ivan Khoronzhuk 6f1f58361f net: ethernet: ti: cpsw: remove priv from cpsw_get_slave_port() parameters list
There is no need in priv here.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:39 -07:00
Ivan Khoronzhuk 0a440f8f4f net: ethernet: ti: cpsw: remove intr dbg msg from poll handlers
In the poll handler there is no way to figure out which network device is
handling packets, as cpdma channels are common for both network
devices in dual_emac mode. Currently, the messages are printed only
for one device when in fact there are two. This print msg is incorrect
and does not seem very useful, so drop it from the poll handler.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:38 -07:00
Ivan Khoronzhuk 27e9e10391 net: ethernet: ti: cpsw: simplify submit routine
As second net dev is created only in case of dual_emac mode, port
number can be figured out in simpler way. Also no need to pass
redundant ndev struct.

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:27:38 -07:00
Gao Feng ab10dccb11 rps: Inspect PPTP encapsulated by GRE to get flow hash
PPTP is encapsulated in a GRE header whose GRE_VERSION bits must
contain one, but the current GRE RPS code requires GRE_VERSION to be
zero. So RPS does not work for PPTP traffic.

In my test environment there are four MIPS cores, and all traffic
passes through PPTP. As a result, only one core is 100% busy
while the other three cores are nearly idle. After this patch, the load
is balanced well across the four cores.
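
The version check described above can be sketched in user-space Python as follows. This is an illustration only, not the flow dissector code: the constant and function names are assumptions, while the field layout (version in the low three bits of the first 16-bit word, protocol type 0x880B for PPTP's PPP payload) follows the GRE and PPTP specifications.

```python
import struct

GRE_VERSION_MASK = 0x0007  # low three bits of the first 16-bit word
GRE_PROTO_PPP = 0x880B     # protocol type used by PPTP's enhanced GRE

def gre_version_and_proto(gre_header):
    """Return (version, protocol_type) from the first 4 bytes of a GRE header."""
    flags_ver, proto = struct.unpack("!HH", gre_header[:4])
    return flags_ver & GRE_VERSION_MASK, proto

def is_pptp_gre(gre_header):
    """True for version 1 GRE carrying PPP, which is what PPTP sends."""
    version, proto = gre_version_and_proto(gre_header)
    return version == 1 and proto == GRE_PROTO_PPP

# Plain GRE carrying IPv4: no flags, version 0.
plain_gre = struct.pack("!HH", 0x0000, 0x0800)
# PPTP GRE: key and sequence-number flags set, version 1, PPP payload.
pptp_gre = struct.pack("!HH", 0x3001, GRE_PROTO_PPP)

print(is_pptp_gre(plain_gre), is_pptp_gre(pptp_gre))
```

A dissector that accepts only version 0 would stop at the second header, which is why all PPTP flows ended up with the same hash.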

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:22:14 -07:00
David S. Miller 084c9535aa Merge branch 'qdisc-hashtable'
Jiri Kosina says:

====================
Convert qdisc linked list into a hashtable

This is a respin of the v6 of the original patch [1], split into two-patch
series as requested by davem; first patch fixes all symbol conflicts
that'd happen once netdevice.h starts to include hashtable.h, the second
one performs the actual switch to hashtable.

I've preserved Cong's Reviewed-by:, as code-wise this series is identical
to the original v6 of the patch.

[1] lkml.kernel.org/r/alpine.LNX.2.00.1608011220580.22028@cbobk.fhfr.pm
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:19:07 -07:00
Jiri Kosina 59cc1f61f0 net: sched: convert qdisc linked list to hashtable
Convert the per-device linked list into a hashtable. The primary
motivation for this change is that currently, we're not tracking all the
qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
performed over the linked list by qdisc_match_from_root() is rather
expensive.

The ultimate goal is to get rid of hidden qdiscs completely, which will
bring much more determinism in user experience.
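
As a rough illustration of the cost difference (a Python model, not the kernel data structures), the two hypothetical registries below look up a qdisc by handle, one by walking a list and one through a hash table. All class and variable names here are invented for the sketch.

```python
class ListRegistry:
    """Models a per-device linked list: lookup walks the entries in order."""

    def __init__(self):
        self._entries = []

    def add(self, handle, qdisc):
        self._entries.append((handle, qdisc))

    def lookup(self, handle):
        for entry_handle, qdisc in self._entries:
            if entry_handle == handle:
                return qdisc
        return None


class HashRegistry:
    """Models a hashtable keyed by handle: lookup cost does not grow with size."""

    def __init__(self):
        self._by_handle = {}

    def add(self, handle, qdisc):
        self._by_handle[handle] = qdisc

    def lookup(self, handle):
        return self._by_handle.get(handle)


list_registry = ListRegistry()
hash_registry = HashRegistry()
for n in range(1, 1001):
    handle = n << 16  # major number in the upper 16 bits, minor 0
    list_registry.add(handle, "qdisc-%d" % n)
    hash_registry.add(handle, "qdisc-%d" % n)

# Both give the same answer; only the amount of work per lookup differs.
print(list_registry.lookup(1000 << 16), hash_registry.lookup(1000 << 16))
```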

Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:19:02 -07:00
Jiri Kosina e87a8f24c9 net: resolve symbol conflicts with generic hashtable.h
This is a preparatory patch for converting qdisc linked list into a
hashtable. As we'll need to include hashtable.h in netdevice.h, we first
have to make sure that this will not introduce symbol conflicts for any of
the netdevice.h users.

Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:18:52 -07:00
Niklas Söderlund b89b815c32 ravb: use proper names for suspend/resume functions
The patch 'ravb: add sleep PM suspend/resume support' used incorrect
function names containing 'runtime' for the suspend and resume
functions.

Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 14:05:48 -07:00
Uwe Kleine-König 9c706a49d6 net: ipconfig: fix use after free
ic_close_devs() calls kfree() for all devices' ic_device. Since commit
2647cffb2b ("net: ipconfig: Support using "delayed" DHCP replies")
the active device's ic_device is still used however to print the
ipconfig summary which results in an oops if the memory is already
changed. So delay freeing until after the autoconfig results are
reported.

Fixes: 2647cffb2b ("net: ipconfig: Support using "delayed" DHCP replies")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 14:04:23 -07:00
Niklas Söderlund 0184165b2f ravb: add sleep PM suspend/resume support
The interface would not function after the system had been woken up
from a suspend (echo mem > /sys/power/state) cycle. The
reason for this is that all device registers have been reset to their
default values. This patch adds sleep suspend and resume functions that
detach the interface at suspend and restore the registers and reattach
the interface at resume.

Only the registers that are only configured at probe time need to be
explicitly restored by the resume handler. All other registers are
reconfigured by either reopening the device in the resume handler (if
the device was running when the system was suspended) or when the
interface is opened by a user at a later time.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09 16:16:08 -07:00
Julia Lawall 0dff88d39f net: dsa: b53: constify b53_io_ops structures
The b53_io_ops structures are never modified, so declare them as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09 15:09:40 -07:00
David Ahern 631fee7d70 net: Remove fib_local variable
After commit 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
fib_local is set but not used. Remove it.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09 14:57:39 -07:00
Guillaume Nault bb8082f691 ppp: build ifname using unit identifier for rtnl based devices
Userspace programs generally need to know the name of the ppp devices
they create. Both ioctl and rtnl interfaces use the ppp<suffix> scheme
to name them. But although the suffix used by the ioctl interface can
be known by userspace (it's the PPP unit identifier returned by the
PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the
kernel.

This patch brings more consistency between ioctl and rtnl based ppp
devices by generating device names using the PPP unit identifier as
suffix in both cases. This way, userspace can always infer the name of
the devices they create.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09 14:56:21 -07:00
Nicolas Iooss 6cdaf03f8c RDS: add __printf format attribute to error reporting functions
This is helpful to detect at compile-time errors related to format
strings.

Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:16:21 -07:00
Raju Lakkaraju d50736a853 Microsemi VSC 8531/41 PHY Driver
Hello,

I added all review comments and re-sending for review.

From a5017f5878a92d2acec86a6a29b1498c457cb73a Mon Sep 17 00:00:00 2001
From: Nagaraju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Date: Wed, 3 Aug 2016 18:28:24 +0530
Subject: [PATCH v2] net: phy: Add drivers for Microsemi PHYs

Signed-off-by: Nagaraju Lakkaraju <Raju.Lakkaraju@microsemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:15:57 -07:00
Julia Lawall 07bf2e11ad net/fsl: use of_property_read_bool
Use of_property_read_bool to check for the existence of a property.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e1,e2,x;
@@
-	if (of_get_property(e1,e2,NULL))
-		x = true;
-	else
-		x = false;
+	x = of_property_read_bool(e1,e2);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:15:00 -07:00
Haiyang Zhang 7f5d5af0b2 hv_netvsc: Add handler for physical link speed change
On Hyper-V host 2016 and later, VMs get an event message of the physical
link speed when vSwitch is changed. This patch handles this message, so
the updated link speed can be reported by ethtool.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:14:07 -07:00
Haiyang Zhang b37879e6ca hv_netvsc: Add query for initial physical link speed
The physical link speed value will be reported by ethtool command.
The real speed is available from Windows 2016 host or later.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:14:07 -07:00
Grygorii Strashko aeec302104 net: ethernet: ti: cpdma: remove used_desc counter
The struct cpdma_desc_pool->used_desc field can be safely removed from
CPDMA driver (and hot patch) because used_descs counter is used just
for pool consistency check at CPDMA deinitialization and now this
check can be re-implemented using gen_pool_size(pool->gen_pool) !=
gen_pool_avail(pool->gen_pool).
Moreover, this will allow getting rid of warnings in
cpdma_desc_pool_destroy()-> WARN_ON(pool->used_desc) which may happen
because the used_descs is used unprotected, since CPDMA has been
switched to use genalloc, and may get wrong values on SMP.

Hence, remove used_desc from struct cpdma_desc_pool.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:12:17 -07:00
Michal Soltys 37088f617d net/sched/sch_hfsc.c: remove unused cl_myfadj
The code using this variable has been commented out in the past as it
was causing issues in upperlimited link-sharing scenarios.

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:06:47 -07:00
Michal Soltys 678a6241c6 net/sched/sch_hfsc.c: keep fsc and virtual times in sync; fix an old bug
This patch simplifies how we update fsc and calculate vt from it - while
keeping the expected functionality identical with how hfsc behaves
currently. It also fixes a certain issue introduced with
a very old patch.

The idea is, that instead of correcting cl_vt before fsc curve update
(rtsc_min) and correcting cl_vt after calculation (rtsc_y2x) to keep
cl_vt local to the current period - we can simply rely on virtual times
and curve values always being in sync - analogously to how rsc and usc
function, except that we use virtual time here.

Why hasn't it been done since the beginning this way ? The likely scenario
(basing on the code trying to correct curves whenever possible) was to
keep the virtual times as small as possible - as they have tendency to
"gallop" forward whenever their siblings and other fair sharing
subtrees are idling. On top of that, current code is subtly bugged, so
cumulative time (without any corrections) is always kept and used in
init_vf() when a new backlog period begins (using cl_cvtoff).

Is cumulative value safe ? Generally yes, though corner cases are easy
to create. For example consider:

1gbit interface
some 100kbit leaf, everything else idle

With current tick (64ns) 1s is 15625000 ticks, but the leaf is alone and
it's virtual time, so in reality it's 10000 times more. ITOW 38 bits are
needed to hold 1 second. 54 - 1 day, 59 - 1 month, 63 - 1 year (all
logarithms rounded up). It's getting somewhat dangerous, but also
requires setup excusing this kind of values not mentioning permanently
backlogged class for a year. In near most extreme case (10gbit, 10kbit
leaf), we have "enough" to hold ~13.6 days in 64 bits.
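
The figures above can be re-derived with a few lines of Python. This is a worked check under stated assumptions: a month is taken as 30 days and a year as 365 days, and virtual time is assumed to advance link_rate / leaf_rate times faster than wall-clock ticks, as the text describes. The names are illustrative only.

```python
TICK_NS = 64
TICKS_PER_SECOND = 1_000_000_000 // TICK_NS  # 15625000 ticks per second

def vt_ticks(seconds, link_bps, leaf_bps):
    """Virtual-time ticks accumulated by a lone leaf over `seconds`."""
    return TICKS_PER_SECOND * (link_bps // leaf_bps) * seconds

def bits_needed(value):
    """Number of bits required to store `value`."""
    return value.bit_length()

DAY = 86400
periods = {"1 second": 1, "1 day": DAY, "1 month": 30 * DAY, "1 year": 365 * DAY}

# 1gbit interface, 100kbit leaf: virtual time runs 10000 times faster.
bits = {name: bits_needed(vt_ticks(secs, 10**9, 10**5))
        for name, secs in periods.items()}
print(bits)

# Near-extreme case: 10gbit interface, 10kbit leaf, 64-bit counter.
ticks_per_second_extreme = vt_ticks(1, 10**10, 10**4)
days_in_64_bits = 2**64 / ticks_per_second_extreme / DAY
print(round(days_in_64_bits, 2))
```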

Well, the issue remains mostly theoretical and cl_cvtoff has been
working fine for all those years. Sensible configuration are de-facto
immune to this issue, and not so sensible can solve it with a cronjob
and its period inversely proportional to the insanity of such setup =)

Now let's explain the subtle bug mentioned earlier.

The issue is related to how offsets are kept and how we calculate
virtual times and update fair service curve(s). The issue itself is
subtle, but easy to observe with long m1 segments. It was introduced in
rather old patch:

Commit 99296150c7: "[NET_SCHED]: O(1) children vtoff adjustment
in HFSC scheduler"

(available in git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git)

Originally when a new backlog period was started, cl_vtoff of each
sibling was updated with cl_cvtmax from past period - naturally moving
all cl_vt to proper starting point. That patch adjusted it so cumulative
offset is kept in the parent, and there is no need for traversing the
list (as any subsequent child activation derives new vt from already
active sibling(s)).

But with this change, cl_vtoff (of each sibling) is no longer persistent
across the inactivity periods, as it's calculated from parent's
cl_cvtoff on a new backlog period, conflicting with the following curve
correction from the previous period:

if (cl->cl_virtual.x == vt) {
	cl->cl_virtual.x -= cl->cl_vtoff;
	cl->cl_vtoff = 0;
}

This essentially tries to keep curve as if it was local to the period
and resets cl_vtoff (cumulative vt offset of the class) to 0 when
possible (read: when we have an intersection or if a new curve is below
the old one). But then it's recalculated from cl_cvtoff on next active
period.  Then rtsc_min() call preceding the above if() doesn't really
do what we expect it to do in such scenario - as it calculates the
minimum of corrected curve (from the previous backlog period) and the
new uncorrected curve (with offset derived from cl_cvtoff).

Example:

tc class add dev $ife parent 1:0 classid 1:1  hfsc ls m2 100mbit ul m2 100mbit
tc class add dev $ife parent 1:1 classid 1:10 hfsc ls m1 80mbit d 10s m2 20mbit
tc class add dev $ife parent 1:1 classid 1:11 hfsc ls m2 20mbit

start B, keep it backlogged, let it run 6s (30s worth of vt as A is idle)
pause B briefly to force cl_cvtoff update in parent (whole 1:1 going idle)
start A, let it run 10s
pause A briefly to force rtsc_min()

At this point we would expect A to continue at 20mbit after a brief
moment of 80mbit. But instead A will use 80mbit for full 10s again. It's
the effect of first correcting A (during 'start A'), and then - after
unpausing - calculating rtsc_min() from old corrected and new uncorrected
curve.

The patch fixes this bug and keeps vt and fsc in sync (virtual times
are cumulative, not local to the backlog period).

Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:06:47 -07:00
Wei Yongjun 0caf5b261b qed: Use DEFINE_SPINLOCK() for spinlock
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:05:16 -07:00
Hangbin Liu a052517a8f net/multicast: should not send source list records when have filter mode change
Based on RFC3376 5.1 and RFC3810 6.1

   If the per-interface listening change that triggers the new report is
   a filter mode change, then the next [Robustness Variable] State
   Change Reports will include a Filter Mode Change Record.  This
   applies even if any number of source list changes occur in that
   period.

   Old State         New State         State Change Record Sent
   ---------         ---------         ------------------------
   INCLUDE (A)       EXCLUDE (B)       TO_EX (B)
   EXCLUDE (A)       INCLUDE (B)       TO_IN (B)

So we should not send source-list change if there is a filter-mode change.

Here are two scenarios:
1. Group deleted and filter mode is EXCLUDE, which means we need send a
   TO_IN { }.
2. Not group deleted, but has pcm->crcount, which means we need send a
   normal filter-mode-change.

At the same time, if the type is ALLOW or BLOCK and psf->sf_crcount is set,
we stop adding records and decrease sf_crcount directly.
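
The state table quoted above can be modelled as a small Python function. This is only a sketch of the RFC rule, not of the kernel's record-building code, and the function name and the tuple representation of a record are assumptions made for the example.

```python
def filter_mode_change_record(old_mode, new_mode, new_sources):
    """Return the Filter Mode Change Record for a filter mode transition.

    Returns None when the mode did not change, in which case source-list
    change records (ALLOW/BLOCK) would apply instead.
    """
    if old_mode == new_mode:
        return None
    if new_mode == "EXCLUDE":
        return ("TO_EX", frozenset(new_sources))
    if new_mode == "INCLUDE":
        return ("TO_IN", frozenset(new_sources))
    raise ValueError("unknown filter mode: %r" % (new_mode,))

# INCLUDE (A) -> EXCLUDE (B) sends TO_EX (B).
print(filter_mode_change_record("INCLUDE", "EXCLUDE", {"192.0.2.1"}))
# Scenario 1: deleting a group in EXCLUDE mode behaves like INCLUDE {},
# so the record is TO_IN with an empty source list.
print(filter_mode_change_record("EXCLUDE", "INCLUDE", set()))
```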

Reference: https://www.ietf.org/mail-archive/web/magma/current/msg01274.html

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 16:04:39 -07:00
Philippe Reynes 013ad40d37 net: ethernet: marvell: mvneta: use new api ethtool_{get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move the mvneta driver to new api {get|set}_link_ksettings.

We use the generic function phy_ethtool_get_link_ksettings,
and update old mvneta_ethtool_set_settings to the new api.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 15:42:21 -07:00
Philippe Reynes c6c022e360 net: ethernet: marvell: mvneta: use phydev from struct net_device
The private structure contains a pointer to phydev, but the structure
net_device already contains such a pointer. So we can remove the pointer
phy_dev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 15:42:21 -07:00
Philippe Reynes 72582fdb92 net: ethernet: greth: use phy_ethtool_{get|set}_link_ksettings
There are two generic functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 15:42:21 -07:00
Philippe Reynes 65752dda4b net: ethernet: greth: use phydev from struct net_device
The private structure contains a pointer to phydev, but the structure
net_device already contains such a pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 15:42:20 -07:00
Philippe Reynes 0d5704bf4e net: ethernet: octeon: use phy_ethtool_{get|set}_link_ksettings
There are two generic functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

There was a check on CAP_NET_ADMIN in cvm_oct_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-08 15:42:20 -07:00