Commit graph

547110 commits

Author SHA1 Message Date
Vivien Didelot
57a47532c4 net: dsa: fix preparation of a port STP update
Because of the default 0 value of ret in dsa_slave_port_attr_set, a
driver may return -EOPNOTSUPP from the commit phase of a STP state,
which triggers a WARN() from switchdev.

This happened on a 6185 switch which does not support hardware bridging.

Fixes: 3563606258 ("switchdev: convert STP update to switchdev attr set")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:36:01 -07:00
Thomas Huth
9ae6d4935e testptp: Silence compiler warnings on ppc64
When compiling Documentation/ptp/testptp.c the following compiler
warnings are printed out:

Documentation/ptp/testptp.c: In function ‘main’:
Documentation/ptp/testptp.c:367:11: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 3 has type ‘__s64’ [-Wformat=]
           event.t.sec, event.t.nsec);
           ^
Documentation/ptp/testptp.c:505:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i)->sec, (pct+2*i)->nsec);
     ^
Documentation/ptp/testptp.c:507:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i+1)->sec, (pct+2*i+1)->nsec);
     ^
Documentation/ptp/testptp.c:509:5: warning: format ‘%lld’ expects argument
    of type ‘long long int’, but argument 2 has type ‘__s64’ [-Wformat=]
     (pct+2*i+2)->sec, (pct+2*i+2)->nsec);

This happens because __s64 is by default defined as "long" on ppc64,
not as "long long". However, to fix these warnings, it's possible to
define the __SANE_USERSPACE_TYPES__ so that __s64 gets defined to
"long long" on ppc64, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:16:56 -07:00
Robb Manes
23860f103b net/mlx4: Handle return codes in mlx4_qp_attach_common
Both new_steering_entry() and existing_steering_entry() return values
based on their success or failure, but currently they fall through
silently.  This can make troubleshooting difficult, as we were unable
to tell which one of these two functions returned errors or
specifically what code was returned.  This patch remedies that
situation by passing the return codes to err, which is returned by
mlx4_qp_attach_common() itself.

This also addresses a leak in the call to mlx4_bitmap_free() as well.

Signed-off-by: Robb Manes <rmanes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:14:01 -07:00
Andrew Lunn
c047a1f918 dsa: mv88e6xxx: Enable forwarding for unknown to the CPU port
Frames destined to an unknown address must be forwarded to the CPU
port. Otherwise incoming ARP, dhcp leases, etc, do not work.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 21:06:51 -07:00
Pravin B Shelar
31b33dfb0a skbuff: Fix skb checksum partial check.
Earlier patch 6ae459bda tried to detect void ckecksum partial
skb by comparing pull length to checksum offset. But it does
not work for all cases since checksum-offset depends on
updates to skb->data.

Following patch fixes it by validating checksum start offset
after skb-data pointer is updated. Negative value of checksum
offset start means there is no need to checksum.

Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin <avagin@odin.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 16:48:46 -07:00
David Ahern
741a11d9e4 net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set
Wolfgang reported that IPv6 stack is ignoring oif in output route lookups:

    With ipv6, ip -6 route get always returns the specific route.

    $ ip -6 r
    2001:db8:e2::1 dev enp2s0  proto kernel  metric 256
    2001:db8:e2::/64 dev enp2s0  metric 1024
    2001:db8:e3::1 dev enp3s0  proto kernel  metric 256
    2001:db8:e3::/64 dev enp3s0  metric 1024
    fe80::/64 dev enp3s0  proto kernel  metric 256
    default via 2001:db8:e3::255 dev enp3s0  metric 1024

    $ ip -6 r get 2001:db8:e2::100
    2001:db8:e2::100 from :: dev enp2s0  src 2001:db8:e3::1  metric 0
        cache

    $ ip -6 r get 2001:db8:e2::100 oif enp3s0
    2001:db8:e2::100 from :: dev enp2s0  src 2001:db8:e3::1  metric 0
        cache

The stack does consider the oif but a mismatch in rt6_device_match is not
considered fatal because RT6_LOOKUP_F_IFACE is not set in the flags.

Cc: Wolfgang Nothdurft <netdev@linux-dude.de>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 15:01:10 -07:00
Alexander Stein
75c261b51b net sysfs: Print link speed as signed integer
Otherwise 4294967295 (MBit/s) (-1) will be printed when there is no link.
Documentation/ABI/testing/sysfs-class-net does not state if this shall be
signed or unsigned.
Also remove the now unused variable fmt_udec.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 14:56:20 -07:00
Andrzej Hajda
4c52b1da53 bna: fix error handling
Several functions can return negative value in case of error,
so their return type should be fixed as well as type of variables
to which this value is assigned.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 13:49:53 -07:00
David S. Miller
3504bb639e Merge branch 'af_unix_MSG_PEEK'
Aaron Conole says:

====================
af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag

This patch set implements a bugfix for kernel.org bugzilla #12323, allowing
MSG_PEEK to return all queued data on the unix domain socket, not just the
data contained in a single SKB.

This is the v3 version of this patch, which includes a suggested modification
by Eric Dumazet to convert the unix_sk() conversion macro to a static inline
function. These patches are independent and can be applied separately.

This set was tested over a 24-hour period, utilizing a loop continually
executing the bugzilla issue attached python code. It was instrumented with
a pr_err_once() ([   13.798683] unix: went there at least one time).

v2->v3:
 - Added Eric Dumazet's suggestion for #define to static inline
 - Fixed an issue calling unix_state_lock() with an invalid argument

v3->v4:
 - Eliminated an XXX comment
 - Changed from goto unlock to explicit unix_state_unlock() and break
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 13:47:08 -07:00
Aaron Conole
9f389e3567 af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag
AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
is set.

This is referenced in kernel bugzilla #12323 @
https://bugzilla.kernel.org/show_bug.cgi?id=12323

As described both in the BZ and lkml thread @
http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
AF_UNIX socket only reads a single skb, where the desired effect is
to return as much skb data has been queued, until hitting the recv
buffer size (whichever comes first).

The modified MSG_PEEK path will now move to the next skb in the tree
and jump to the again: label, rather than following the natural loop
structure. This requires duplicating some of the loop head actions.

This was tested using the python socketpair python code attached to
the bugzilla issue.

Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 13:47:08 -07:00
Aaron Conole
4613012db1 af_unix: Convert the unix_sk macro to an inline function for type safety
As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.

Signed-off-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-29 13:47:07 -07:00
Denys Vlasenko
2103d6b818 net: sctp: Don't use 64 kilobyte lookup table for four elements
Seemingly innocuous sctp_trans_state_to_prio_map[] array
is way bigger than it looks, since
"[SCTP_UNKNOWN] = 2" expands into "[0xffff] = 2" !

This patch replaces it with switch() statement.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
CC: linux-sctp@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:52:21 -07:00
Alexander Couzens
06a15f51cf l2tp: protect tunnel->del_work by ref_count
There is a small chance that tunnel_free() is called before tunnel->del_work scheduled
resulting in a zero pointer dereference.

Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Acked-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:39:10 -07:00
Ivan Mikhaylov
661dfc65f7 net/ibm/emac: bump version numbers for correct work with ethtool
The size of the MAC register dump used to be the size specified by the
reg property in the device tree.  Userland has no good way of finding
out that size, and it was not specified consistently for each MAC type,
so ethtool would end up printing junk at the end of the register dump
if the device tree didn't match the size it assumed.

Using the new version numbers indicates unambiguously that the size of
the MAC register dump is dependent only on the MAC type.

Fixes: 5369c71f7c ("net/ibm/emac: fix size of emac dump memory areas")
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 22:22:50 -07:00
David S. Miller
51359bfc5b Merge branch 'sctp-accept-deadlock'
Karl Heiss says:

====================
sctp: Fix SCTP deadlock

These patches fix a deadlock during accept() of an SCTP connection.

The first patch fixes whitespace issues.

The second patch actually fixes the deadlock race.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 21:03:40 -07:00
Karl Heiss
635682a144 sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
A case can occur when sctp_accept() is called by the user during
a heartbeat timeout event after the 4-way handshake.  Since
sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
the listening socket but released with the new association socket.
The result is a deadlock on any future attempts to take the listening
socket lock.

Note that this race can occur with other SCTP timeouts that take
the bh_lock_sock() in the event sctp_accept() is called.

 BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
 ...
 RIP: 0010:[<ffffffff8152d48e>]  [<ffffffff8152d48e>] _spin_lock+0x1e/0x30
 RSP: 0018:ffff880028323b20  EFLAGS: 00000206
 RAX: 0000000000000002 RBX: ffff880028323b20 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff880028323be0 RDI: ffff8804632c4b48
 RBP: ffffffff8100bb93 R08: 0000000000000000 R09: 0000000000000000
 R10: ffff880610662280 R11: 0000000000000100 R12: ffff880028323aa0
 R13: ffff8804383c3880 R14: ffff880028323a90 R15: ffffffff81534225
 FS:  0000000000000000(0000) GS:ffff880028320000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
 CR2: 00000000006df528 CR3: 0000000001a85000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process swapper (pid: 0, threadinfo ffff880616b70000, task ffff880616b6cab0)
 Stack:
 ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000
 <d> 0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00
 <d> ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8
 Call Trace:
 <IRQ>
 [<ffffffffa01c2582>] ? sctp_rcv+0x492/0xa10 [sctp]
 [<ffffffff8148c559>] ? nf_iterate+0x69/0xb0
 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8148c716>] ? nf_hook_slow+0x76/0x120
 [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
 [<ffffffff8149757d>] ? ip_local_deliver_finish+0xdd/0x2d0
 [<ffffffff81497808>] ? ip_local_deliver+0x98/0xa0
 [<ffffffff81496ccd>] ? ip_rcv_finish+0x12d/0x440
 [<ffffffff81497255>] ? ip_rcv+0x275/0x350
 [<ffffffff8145cfeb>] ? __netif_receive_skb+0x4ab/0x750
 ...

With lockdep debugging:

 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 CslRx/12087 is trying to release lock (slock-AF_INET) at:
 [<ffffffffa01bcae0>] sctp_generate_timeout_event+0x40/0xe0 [sctp]
 but there are no more locks to release!

 other info that might help us debug this:
 2 locks held by CslRx/12087:
 #0:  (&asoc->timers[i]){+.-...}, at: [<ffffffff8108ce1f>] run_timer_softirq+0x16f/0x3e0
 #1:  (slock-AF_INET){+.-...}, at: [<ffffffffa01bcac3>] sctp_generate_timeout_event+0x23/0xe0 [sctp]

Ensure the socket taken is also the same one that is released by
saving a copy of the socket before entering the timeout event
critical section.

Signed-off-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 21:03:40 -07:00
Karl Heiss
f05940e618 sctp: Whitespace fix
Fix indentation in sctp_generate_heartbeat_event.

Signed-off-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 21:03:40 -07:00
Mitch Williams
43ae93a93e i40e/i40evf: check for stopped admin queue
It's possible that while we are waiting for the spinlock, another
entity (that owns the spinlock) has shut down the admin queue.
If we then attempt to use the queue, we will panic.

Add a check for this condition on the receive side. This matches
an existing check on the send queue side.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 20:57:14 -07:00
Jesse Brandeburg
c4bbac3913 i40e: fix VLAN inside VXLAN
Previously to this patch, the hardware was removing
VLAN tags from the inner header of VXLAN packets.  The
hardware configuration can be changed to leave the
packet alone since that is what the linux stack
expects for this type of VLAN in VXLAN packet.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-28 20:56:58 -07:00
Andrzej Hajda
72521ea07c r8169: fix handling rtl_readphy result
The function can return negative value.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-26 22:48:32 -07:00
Andrzej Hajda
f26bf06bea net: hisilicon: fix handling platform_get_irq result
The function can return negative value.

The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/assign_signed_to_unsigned.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2046107

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-26 22:46:45 -07:00
Linus Torvalds
518a7cb698 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) When we run a tap on netlink sockets, we have to copy mmap'd SKBs
    instead of cloning them.  From Daniel Borkmann.

 2) When converting classical BPF into eBPF, fix the setting of the
    source reg to BPF_REG_X.  From Tycho Andersen.

 3) Fix igmpv3/mldv2 report parsing in the bridge multicast code, from
    Linus Lussing.

 4) Fix dst refcounting for ipv6 tunnels, from Martin KaFai Lau.

 5) Set NLM_F_REPLACE flag properly when replacing ipv6 routes, from
    Roopa Prabhu.

 6) Add some new cxgb4 PCI device IDs, from Hariprasad Shenai.

 7) Fix headroom tests and SKB leaks in ipv6 fragmentation code, from
    Florian Westphal.

 8) Check DMA mapping errors in bna driver, from Ivan Vecera.

 9) Several 8139cp bug fixes (dev_kfree_skb_any in interrupt context,
    misclearing of interrupt status in TX timeout handler, etc.) from
    David Woodhouse.

10) In tipc, reset SKB header pointer after skb_linearize(), from Erik
    Hugne.

11) Fix autobind races et al. in netlink code, from Herbert Xu with
    help from Tejun Heo and others.

12) Missing SET_NETDEV_DEV in sunvnet driver, from Sowmini Varadhan.

13) Fix various races in timewait timer and reqsk_queue_hadh_req, from
    Eric Dumazet.

14) Fix array overruns in mac80211, from Johannes Berg and Dan
    Carpenter.

15) Fix data race in rhashtable_rehash_one(), from Dmitriy Vyukov.

16) Fix race between poll_one_napi and napi_disable, from Neil Horman.

17) Fix byte order in geneve tunnel port config, from John W Linville.

18) Fix handling of ARP replies over lightweight tunnels, from Jiri
    Benc.

19) We can loop when fib rule dumps cross multiple SKBs, fix from Wilson
    Kok and Roopa Prabhu.

20) Several reference count handling bug fixes in the PHY/MDIO layer
    from Russel King.

21) Fix lockdep splat in ppp_dev_uninit(), from Guillaume Nault.

22) Fix crash in icmp_route_lookup(), from David Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  net: Fix panic in icmp_route_lookup
  net: update docbook comment for __mdiobus_register()
  ppp: fix lockdep splat in ppp_dev_uninit()
  net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
  phy: marvell: add link partner advertised modes
  net: fix net_device refcounting
  phy: add phy_device_remove()
  phy: fixed-phy: properly validate phy in fixed_phy_update_state()
  net: fix phy refcounting in a bunch of drivers
  of_mdio: fix MDIO phy device refcounting
  phy: add proper phy struct device refcounting
  phy: fix mdiobus module safety
  net: dsa: fix of_mdio_find_bus() device refcount leak
  phy: fix of_mdio_find_bus() device refcount leak
  ip6_tunnel: Reduce log level in ip6_tnl_err() to debug
  ip6_gre: Reduce log level in ip6gre_err() to debug
  fib_rules: fix fib rule dumps across multiple skbs
  bnx2x: byte swap rss_key to comply to Toeplitz specs
  net: revert "net_sched: move tp->root allocation into fw_init()"
  lwtunnel: remove source and destination UDP port config option
  ...
2015-09-26 06:01:33 -04:00
David Ahern
bdb06cbf77 net: Fix panic in icmp_route_lookup
Andrey reported a panic:

[ 7249.865507] BUG: unable to handle kernel pointer dereference at 000000b4
[ 7249.865559] IP: [<c16afeca>] icmp_route_lookup+0xaa/0x320
[ 7249.865598] *pdpt = 0000000030f7f001 *pde = 0000000000000000
[ 7249.865637] Oops: 0000 [#1]
...
[ 7249.866811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.3.0-999-generic #201509220155
[ 7249.866876] Hardware name: MSI MS-7250/MS-7250, BIOS 080014  08/02/2006
[ 7249.866916] task: c1a5ab00 ti: c1a52000 task.ti: c1a52000
[ 7249.866949] EIP: 0060:[<c16afeca>] EFLAGS: 00210246 CPU: 0
[ 7249.866981] EIP is at icmp_route_lookup+0xaa/0x320
[ 7249.867012] EAX: 00000000 EBX: f483ba48 ECX: 00000000 EDX: f2e18a00
[ 7249.867045] ESI: 000000c0 EDI: f483ba70 EBP: f483b9ec ESP: f483b974
[ 7249.867077]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 7249.867108] CR0: 8005003b CR2: 000000b4 CR3: 36ee07c0 CR4: 000006f0
[ 7249.867141] Stack:
[ 7249.867165]  320310ee 00000000 00000042 320310ee 00000000 c1aeca00
f3920240 f0c69180
[ 7249.867268]  f483ba04 f855058b a89b66cd f483ba44 f8962f4b 00000000
e659266c f483ba54
[ 7249.867361]  8004753c f483ba5c f8962f4b f2031140 000003c1 ffbd8fa0
c16b0e00 00000064
[ 7249.867448] Call Trace:
[ 7249.867494]  [<f855058b>] ? e1000_xmit_frame+0x87b/0xdc0 [e1000e]
[ 7249.867534]  [<f8962f4b>] ? tcp_in_window+0xeb/0xb10 [nf_conntrack]
[ 7249.867576]  [<f8962f4b>] ? tcp_in_window+0xeb/0xb10 [nf_conntrack]
[ 7249.867615]  [<c16b0e00>] ? icmp_send+0xa0/0x380
[ 7249.867648]  [<c16b102f>] icmp_send+0x2cf/0x380
[ 7249.867681]  [<f89c8126>] nf_send_unreach+0xa6/0xc0 [nf_reject_ipv4]
[ 7249.867714]  [<f89cd0da>] reject_tg+0x7a/0x9f [ipt_REJECT]
[ 7249.867746]  [<f88c29a7>] ipt_do_table+0x317/0x70c [ip_tables]
[ 7249.867780]  [<f895e0a6>] ? __nf_conntrack_find_get+0x166/0x3b0
[nf_conntrack]
[ 7249.867838]  [<f895eea8>] ? nf_conntrack_in+0x398/0x600 [nf_conntrack]
[ 7249.867889]  [<f84c0035>] iptable_filter_hook+0x35/0x80 [iptable_filter]
[ 7249.867933]  [<c16776a1>] nf_iterate+0x71/0x80
[ 7249.867970]  [<c1677715>] nf_hook_slow+0x65/0xc0
[ 7249.868002]  [<c1681811>] __ip_local_out_sk+0xc1/0xd0
[ 7249.868034]  [<c1680f30>] ? ip_forward_options+0x1a0/0x1a0
[ 7249.868066]  [<c1681836>] ip_local_out_sk+0x16/0x30
[ 7249.868097]  [<c1684054>] ip_send_skb+0x14/0x80
[ 7249.868129]  [<c16840f4>] ip_push_pending_frames+0x34/0x40
[ 7249.868163]  [<c16844a2>] ip_send_unicast_reply+0x282/0x310
[ 7249.868196]  [<c16a0863>] tcp_v4_send_reset+0x1b3/0x380
[ 7249.868227]  [<c16a1b63>] tcp_v4_rcv+0x323/0x990
[ 7249.868257]  [<c16776a1>] ? nf_iterate+0x71/0x80
[ 7249.868289]  [<c167dc2b>] ip_local_deliver_finish+0x8b/0x230
[ 7249.868322]  [<c167df4c>] ip_local_deliver+0x4c/0xa0
[ 7249.868353]  [<c167dba0>] ? ip_rcv_finish+0x390/0x390
[ 7249.868384]  [<c167d88c>] ip_rcv_finish+0x7c/0x390
[ 7249.868415]  [<c167e280>] ip_rcv+0x2e0/0x420
...

Prior to the VRF change the oif was not set in the flow struct, so the
VRF support should really have only added the vrf_master_ifindex lookup.

Fixes: 613d09b30f ("net: Use VRF device index for lookups on TX")
Cc: Andrey Melnikov <temnota.am@gmail.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25 21:44:02 -07:00
Russell King
59f069789c net: update docbook comment for __mdiobus_register()
Update the docbook comment for __mdiobus_register() to include the new
module owner argument.  This resolves a warning found by the 0-day
builder.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25 21:37:19 -07:00
Linus Torvalds
d4a748a10e Merge branch 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull another cgroup fix from Tejun Heo:
 "The cgroup writeback support got inadvertently enabled for traditional
  hierarchies revealing two regressions which are currently being worked
  on.  It shouldn't have been enabled on traditional hierarchies, so
  disable it on them.  This is enough to make the regressions go away
  for people who aren't experimenting with cgroup"

* 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup, writeback: don't enable cgroup writeback on traditional hierarchies
2015-09-25 16:20:55 -07:00
Guillaume Nault
58a89ecaca ppp: fix lockdep splat in ppp_dev_uninit()
ppp_dev_uninit() locks all_ppp_mutex while under rtnl mutex protection.
ppp_create_interface() must then lock these mutexes in that same order
to avoid possible deadlock.

[  120.880011] ======================================================
[  120.880011] [ INFO: possible circular locking dependency detected ]
[  120.880011] 4.2.0 #1 Not tainted
[  120.880011] -------------------------------------------------------
[  120.880011] ppp-apitest/15827 is trying to acquire lock:
[  120.880011]  (&pn->all_ppp_mutex){+.+.+.}, at: [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
[  120.880011]
[  120.880011] but task is already holding lock:
[  120.880011]  (rtnl_mutex){+.+.+.}, at: [<ffffffff812e4255>] rtnl_lock+0x12/0x14
[  120.880011]
[  120.880011] which lock already depends on the new lock.
[  120.880011]
[  120.880011]
[  120.880011] the existing dependency chain (in reverse order) is:
[  120.880011]
[  120.880011] -> #1 (rtnl_mutex){+.+.+.}:
[  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
[  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
[  120.880011]        [<ffffffff812e4255>] rtnl_lock+0x12/0x14
[  120.880011]        [<ffffffff812d9d94>] register_netdev+0x11/0x27
[  120.880011]        [<ffffffffa0147b17>] ppp_ioctl+0x289/0xc98 [ppp_generic]
[  120.880011]        [<ffffffff8113b367>] do_vfs_ioctl+0x4ea/0x532
[  120.880011]        [<ffffffff8113b3fd>] SyS_ioctl+0x4e/0x7d
[  120.880011]        [<ffffffff813ad7d7>] entry_SYSCALL_64_fastpath+0x12/0x6f
[  120.880011]
[  120.880011] -> #0 (&pn->all_ppp_mutex){+.+.+.}:
[  120.880011]        [<ffffffff8107334e>] __lock_acquire+0xb07/0xe76
[  120.880011]        [<ffffffff81073a6f>] lock_acquire+0xcf/0x10e
[  120.880011]        [<ffffffff813ab18a>] mutex_lock_nested+0x56/0x341
[  120.880011]        [<ffffffffa0145f56>] ppp_dev_uninit+0x64/0xb0 [ppp_generic]
[  120.880011]        [<ffffffff812d5263>] rollback_registered_many+0x19e/0x252
[  120.880011]        [<ffffffff812d5381>] rollback_registered+0x29/0x38
[  120.880011]        [<ffffffff812d53fa>] unregister_netdevice_queue+0x6a/0x77
[  120.880011]        [<ffffffffa0146a94>] ppp_release+0x42/0x79 [ppp_generic]
[  120.880011]        [<ffffffff8112d9f6>] __fput+0xec/0x192
[  120.880011]        [<ffffffff8112dacc>] ____fput+0x9/0xb
[  120.880011]        [<ffffffff8105447a>] task_work_run+0x66/0x80
[  120.880011]        [<ffffffff81001801>] prepare_exit_to_usermode+0x8c/0xa7
[  120.880011]        [<ffffffff81001900>] syscall_return_slowpath+0xe4/0x104
[  120.880011]        [<ffffffff813ad931>] int_ret_from_sys_call+0x25/0x9f
[  120.880011]
[  120.880011] other info that might help us debug this:
[  120.880011]
[  120.880011]  Possible unsafe locking scenario:
[  120.880011]
[  120.880011]        CPU0                    CPU1
[  120.880011]        ----                    ----
[  120.880011]   lock(rtnl_mutex);
[  120.880011]                                lock(&pn->all_ppp_mutex);
[  120.880011]                                lock(rtnl_mutex);
[  120.880011]   lock(&pn->all_ppp_mutex);
[  120.880011]
[  120.880011]  *** DEADLOCK ***

Fixes: 8cb775bc0a ("ppp: fix device unregistration upon netns deletion")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25 12:38:11 -07:00
Sudip Mukherjee
21343ac21e net: via/Kconfig: GENERIC_PCI_IOMAP required if PCI not selected
The builds of allmodconfig of avr32 is failing with:

drivers/net/ethernet/via/via-rhine.c:1098:2: error: implicit declaration
of function 'pci_iomap' [-Werror=implicit-function-declaration]
drivers/net/ethernet/via/via-rhine.c:1119:2: error: implicit declaration
of function 'pci_iounmap' [-Werror=implicit-function-declaration]

The generic empty pci_iomap and pci_iounmap is used only if CONFIG_PCI
is not defined and CONFIG_GENERIC_PCI_IOMAP is defined.

Add GENERIC_PCI_IOMAP in the dependency list for VIA_RHINE as we are
getting build failure when CONFIG_PCI and CONFIG_GENERIC_PCI_IOMAP both
are not defined.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25 12:36:58 -07:00
Russell King
357cd64c18 phy: marvell: add link partner advertised modes
Read the standard link partner advertisment registers and store it in
phydev->lp_advertising, so ethtool can report this information to
userspace via ethtool.  Zero it as per genphy if autonegotiation is
disabled.  Tested with a Marvell 88E1512 PHY.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-25 12:23:47 -07:00
Linus Torvalds
03e8f64486 Merge branch 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "This is an assorted set I've been queuing up:

  Jeff Mahoney tracked down a tricky one where we ended up starting IO
  on the wrong mapping for special files in btrfs_evict_inode.  A few
  people reported this one on the list.

  Filipe found (and provided a test for) a difficult bug in reading
  compressed extents, and Josef fixed up some quota record keeping with
  snapshot deletion.  Chandan killed off an accounting bug during DIO
  that lead to WARN_ONs as we freed inodes"

* 'for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: keep dropped roots in cache until transaction commit
  Btrfs: Direct I/O: Fix space accounting
  btrfs: skip waiting on ordered range for special files
  Btrfs: fix read corruption of compressed and shared extents
  Btrfs: remove unnecessary locking of cleaner_mutex to avoid deadlock
  Btrfs: don't initialize a space info as full to prevent ENOSPC
2015-09-25 12:08:41 -07:00
Linus Torvalds
101688f534 NFS client bugfixes for Linux 4.3
Highlights include:
 
 Stable patches:
 - fix v4.2 SEEK on files over 2 gigs
 - Fix a layout segment reference leak when pNFS I/O falls back to inband I/O.
 - Fix recovery of recalled read delegations
 
 Bugfixes:
 - Fix a case where NFSv4 fails to send CLOSE after a server reboot
 - Fix sunrpc to wait for connections to complete before retrying
 - Fix sunrpc races between transport connect/disconnect and shutdown
 - Fix an infinite loop when layoutget fail with BAD_STATEID
 - nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
 - Fix a bogus WARN_ON_ONCE() in O_DIRECT when layout commit_through_mds is set
 - Fix layoutreturn/close ordering issues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBWKdAAoJEGcL54qWCgDy2rUP/iIWUQSpUPfKKw7xquQUQe4j
 ci4nFxpJ/zhKj1u7x3wrkxZAXcEooYo+ZJ7ayzROKcfQL/sUSWGbSLdr3mqrQynv
 b0SDmnJK9V+CdBQrA+Jp5UGQxcumpMxsAfqVznT0qkf/wDp44DCVgDz5Aj8cRbWU
 6xPfMgVLEnXiId9IgKqg3sJ2NmvMZXuI9sHM6hp6OzRmQDjTcx+LgRz7tnQHgaEk
 zGz8R6eDm3OA0wfApqZwJ6JY793HsDdy30W9L0Yi2PVGXfzwoEB8AqgLVwSDIY1B
 5hG5zn3tg9PSz9vhJ7M2h4AgFHdB3w3XGdJUafwqZEeqEIagw1iFCWlMyo/lE2dG
 G7oob9Jiiwxjc3RDWn2wGaafymrrWZwl2nYzC4O3UvJ3hVJ0mEl1iJagK1m8LzfN
 fmnP7tTyPuoOXkzDogZ0YI3FrngO6430PoR2hUPkS1yce/a+IV0HQEmXbSDSwN80
 1d9zyC9TnPj6rFjZeaGxGK17BpkC0oIQCPq4OSJB4396wzAwMqoJjJVVWWeAK4UC
 PxzoXqAAaBFguSsDbuBMcXgiuUw/7DIZ/pdzsWSiCFgocgF5ZdJdieCNtGk0nbLM
 37R7HCauF93JDrkpUMKPnLXScb2IbEh31pFtKzptJYKwMxEiScXXiP3NE9hfX65i
 2zLkl2aBvd154RvVKNbp
 =GdeV
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable patches:
   - fix v4.2 SEEK on files over 2 gigs
   - Fix a layout segment reference leak when pNFS I/O falls back to inband I/O.
   - Fix recovery of recalled read delegations

  Bugfixes:
   - Fix a case where NFSv4 fails to send CLOSE after a server reboot
   - Fix sunrpc to wait for connections to complete before retrying
   - Fix sunrpc races between transport connect/disconnect and shutdown
   - Fix an infinite loop when layoutget fail with BAD_STATEID
   - nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
   - Fix a bogus WARN_ON_ONCE() in O_DIRECT when layout commit_through_mds is set
   - Fix layoutreturn/close ordering issues"

* tag 'nfs-for-4.3-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS41: make close wait for layoutreturn
  NFS: Skip checking ds_cinfo.buckets when lseg's commit_through_mds is set
  NFSv4.x/pnfs: Don't try to recover stateids twice in layoutget
  NFSv4: Recovery of recalled read delegations is broken
  NFS: Fix an infinite loop when layoutget fail with BAD_STATEID
  NFS: Do cleanup before resetting pageio read/write to mds
  SUNRPC: xs_sock_mark_closed() does not need to trigger socket autoclose
  SUNRPC: Lock the transport layer on shutdown
  nfs/filelayout: Fix NULL reference caused by double freeing of fh_array
  SUNRPC: Ensure that we wait for connections to complete before retrying
  SUNRPC: drop null test before destroy functions
  nfs: fix v4.2 SEEK on files over 2 gigs
  SUNRPC: Fix races between socket connection and destroy code
  nfs: fix pg_test page count calculation
  Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount
2015-09-25 11:33:52 -07:00
Linus Torvalds
ddff42e592 sound fixes for 4.3-rc3
This ended up with a larger set of fixes than wished, unfortunately.
 As diffstat shows, the majority of changes are for various ASoC
 drivers (Realtek, Wolfson codec drivers, etc), in addition to a couple
 of HD-audio regression fixes.  All these are reasonably small and
 nothing to scare much.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJWBTxTAAoJEGwxgFQ9KSmkzroQAIQNPQd3EFlVOQzRKTu6qFe4
 MiCOwHdc8GHTS2ZmXrFiL9gOn8MzMCUjVLaxxK9oVCdM6maC/IpbQz7mFQKXyqRW
 KkmVop/V0LNV4elw6RhDYYlDSwjP8vgD3MjJc8y3Hr8cpydrEUDDNFrdvpDBp5rE
 Tj9r2CxJBc2l06GteRiwjTZl0IjfLr9lseJiwtNHohmNJeDgVGHK3GOxOxiUxHgd
 VpdCG3VuzIdqNi1+Dp0PjFqHdw2aAgzaMbJL57X7nIktQwDM8bwM74cZMd4ykJIZ
 aIY8SrMuLMdu2yD80EqVLej3vy+fEAJovhA4GRA3bFfqBrrQtteNMMdJ01kqkZSI
 cQhjlJ3G7xwcF/88CzNI7eoyn5f5USl78NuCQvxvU2cqhuWeog/t0jgQcVCJBZFh
 Z+2QGlg/dg1EZ5WujQ8eETDAI9FYz0g56Q4N/reKtjCHjs8RRcfZXchn/J8ZvQQ0
 E2bRwazdZzeS3m4xzVka9Y1B3ei2F8CXJ9g1iQNPzkurHy4MStIuygp6ICIpvLlw
 RX/0c14p60TTz2SKXCNEwfcaW6F9pHqzmh4tEZXqq4LNgXgp17y7MNv/T+1yw/KW
 X8O7M40GAoUIH77Rr3UGayuCE38eNrggaox3BJvtyFtSy7RR7HS7TKSxzlNtkUCK
 5tN2uTta3YoA14OJUix9
 =cdb6
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This ended up with a larger set of fixes than wished, unfortunately.

  As diffstat shows, the majority of changes are for various ASoC
  drivers (Realtek, Wolfson codec drivers, etc), in addition to a couple
  of HD-audio regression fixes.  All these are reasonably small and
  nothing to scare much"

* tag 'sound-4.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
  ALSA: hda - Disable power_save_node for Thinkpads
  ALSA: hda/tegra - async probe for avoiding module loading deadlock
  ASoC: rt5645: Prevent the pop sound in case of playback and the jack is plugging
  ASoC: rt5645: Increase the delay time to remove the pop sound
  ASoC: rt5645: Use the type SOC_DAPM_SINGLE_AUTODISABLE to prevent the weird sound in runtime of power up
  ASoC: pxa: pxa2xx-ac97: fix dma requestor lines
  MAINTAINERS: Update website and git repo for Wolfson Microelectronics
  ASoC: fsl_ssi: Fix checking of dai format for AC97 mode
  ASoC: wm0010: fix error path
  ASoC: wm0010: fix memory leak
  ASoC: wm8960: correct the max register value of mic boost pga
  ASoC: wm8962: remove 64k sample rate support
  ASoC: davinci-mcasp: Fix devm_kasprintf format string
  ASoC: fix broken pxa SoC support
  ASoC: davinci-mcasp: Set .symmetric_rates = 1 in snd_soc_dai_driver
  ASoC: au1x: psc-i2s: Fix unused variable 'ret' warning
  ASoC: SPEAr: Make SND_SPEAR_SOC select SND_SOC_GENERIC_DMAENGINE_PCM
  ASoC: mediatek: Increase periods_min in capture
  ASoC: davinci-mcasp: Revise the FIFO threshold calculation
  ASoC: wm8960: correct gain value for input PGA and add microphone PGA
  ...
2015-09-25 11:25:30 -07:00
Linus Torvalds
966966a630 PCI updates for v4.3:
Resource management
     - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
     - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas)
 
   MSI
     - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
 
   Renesas R-Car host bridge driver
     - Add R8A7794 support (Sergei Shtylyov)
 
   Miscellaneous
     - Fix devfn for VPD access through function 0 (Alex Williamson)
     - Use function 0 VPD only for identical functions (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBUq2AAoJEFmIoMA60/r8wbwP/0D/+fKEPYJlB6hx1wLHpVk3
 K//vEwH0RgA3v2X53QUoHg94gTYhSZLKX0zdAFshbphE0HCZ6AO3UO+/ZJ3cui6J
 PYvKOnhby2ErNotZqrs3DQIM8rGgl0ZVgoFQrAWEvwiHRHI/r2ArK/oR4PiBjxJT
 StYuJoTkZIlJyHXza6tvHDcWi+Jc8t8r0HC4Vs32BlaVBQM0SH3CMxHfhJw/Q9xP
 WHFif1sH0N+p7WDyHH71C1T8POOgXY73BsD2AC0se3lRYZ9SVkOVy9ECGUucx8F6
 LDAuFelwRvW2Dr9kh38+5f8Xp155E+eZ6zRWW9/JlrUKVEtHhOFhtrRfDNKHuDCt
 B9ETrxDiSUFAdQ2weye9BK6aXK0CHF6YP3PCbvK77qFUUsN8csFSKktanKrFAbML
 CdjkVkEoeLHw+aXzyDg0pSBRZMQ24dTQDh7YqOFZGuEjCLPXOEQ8nitf0IzBB0KI
 4QetT/QK3bKkgtVKTwPP+s9f4g+fA/oiwJ21ZTV9hi/9upywTa/umCUvH9Fmb8Fp
 VZeTzugSht0+ioXpaF/6/KO0Ccp/t5uAHYeuBBMqiHX7ks8DdnfPCwbWNRKkg25O
 Qy7Y8VnnOtesRCAqBq5y/hHlLUluMkjYpEblYFiD6HBWcjUh6xE6LlIO4mwnjqWI
 zjB3w7+0GOrvS7dBSx0N
 =f4c3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge
  window management), and a new Renesas R8A7794 SoC device ID.

  Details:

  Resource management:
   - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
   - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn
     Helgaas)

  MSI:
   - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)

  Renesas R-Car host bridge driver:
   - Add R8A7794 support (Sergei Shtylyov)

  Miscellaneous:
   - Fix devfn for VPD access through function 0 (Alex Williamson)
   - Use function 0 VPD only for identical functions (Alex Williamson)"

* tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: rcar: Add R8A7794 support
  PCI: Use function 0 VPD for identical functions, regular VPD for others
  PCI: Fix devfn for VPD access through function 0
  PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses
  PCI: Clear IORESOURCE_UNSET when clipping a bridge window
  PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
2015-09-25 11:16:53 -07:00
Linus Torvalds
b6d980f493 AMD fixes for bugs introduced in the 4.2 merge window,
and a few PPC bug fixes too.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWBSn7AAoJEL/70l94x66Dxd4H/RT6kWWj9x4grEYUkcJUDyK2
 AXm7XcKQm04auwAic8Otr+ts/Qix/50kWmBe/TU0QLgqb8rj5Dj3yGFK6Z1y6mAz
 KvaxqMJd4tZGTqN0DDvC2ItEdzjfAdeJZo/FHXqPHVspG0G14T7STLna02LTBBEJ
 tNzY9qor8nFhg2fT2szqKaudUNgTqkCTpo57o2BrHE96SHG+m0WdpQCV1F5hPVpg
 Te0Pb7qX9xng5n3sQ7IV/t3QYbrza1ACwNQS9XJa0Yu6iEz7JdmVmzHQASK9ynn6
 hUHhsNYGx4IsPjPtfJk2GroNaRDZL+VMzw07tfcOvPx8xkS9hS63pwzmSBqfLrM=
 =Ywqn
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC
  bug fixes too"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: disable halt_poll_ns as default for s390x
  KVM: x86: fix off-by-one in reserved bits check
  KVM: x86: use correct page table format to check nested page table reserved bits
  KVM: svm: do not call kvm_set_cr0 from init_vmcb
  KVM: x86: trap AMD MSRs for the TSeg base and mask
  KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()
  KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit
  KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs
  kvm: svm: reset mmu on VCPU reset
2015-09-25 10:51:40 -07:00
Linus Torvalds
57cb635c5c powerpc fixes for 4.3 #2
- Wire up sys_membarrier()
  - cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBK25AAoJEFHr6jzI4aWAjpgP/2ZObQW0NToo88d3H2ERiKP9
 S6c8jAkq+NKEKiNyjQ2pf+rwruVjH/DYIMl6HE9wqyEfVtwPqIOjfaVZiXQG6fno
 +BPomCA1negqC3AvVuY9QELYD8rBcZ9nbUK1FeGwlZZySNteaSnz5xi/BvwTo1rm
 puJOSW4KCk/DpGnN6nuBVOJc7NKoGBgbTc3vXqyQVOF+Lu2BQlkfscbHgDnqVrT9
 R5tE6U6pQJVZFD15pxmp6dmib8ujoX0eVKFz89rzEsKTcDIBvnPTyjMYr3Y5Z5hS
 AhOKtunfZg6LOmeg+zd7u1FNwY3PL9ir59fWu5WUXIvqao67k04Li6eggzyZNQNV
 FT8gnj4pFzpNfv1Czm93Ki4dXd3uTYg02OuB3iPq3R3qKuL6cDwS2NGIl0y5iTqI
 kVVQp7u5UVdRdjCNgKmAe48kzVDzAR7B9OFdQBu0JE4NF107ubZKhpokEoagRyru
 CCe+WK1zXvj9S6UaX8f2Mg//oxTqz3jykxr5R2pHb9eFppuKFbj3hmF081OIrcBQ
 rQTR+3MMfEzqw19LaqLWRo70FSMSz8E79+vlHAjdSr5L7FY5YCmioWN231zq5W1R
 69ENkRaCLHdBLysitdrUXnf2uTcrwIRQuqkohG7JWys2+gz1/mqJUsa2rmXKAvuN
 JP3skFQMPlduIzSEy5rI
 =RQ5A
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Wire up sys_membarrier()
 - cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav

* tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  cxl: Fix lockdep warning while creating afu_err_buff attribute
  powerpc: Wire up sys_membarrier()
2015-09-25 10:11:26 -07:00
David Hildenbrand
920552b213 KVM: disable halt_poll_ns as default for s390x
We observed some performance degradation on s390x with dynamic
halt polling. Until we can provide a proper fix, let's enable
halt_poll_ns as default only for supported architectures.

Architectures are now free to set their own halt_poll_ns
default value.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:30 +02:00
Paolo Bonzini
58c95070da KVM: x86: fix off-by-one in reserved bits check
29ecd66019 ("KVM: x86: avoid uninitialized variable warning",
2015-09-06) introduced a not-so-subtle problem, which probably
escaped review because it was not part of the patch context.

Before the patch, leaf was always equal to iterator.level.  After,
it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
and when is_shadow_zero_bits_set does another "-1" the check on
reserved bits becomes incorrect.  Using "iterator.level" in the call
fixes this call trace:

WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[...]
Call Trace:
  dump_stack+0x4e/0x84
  warn_slowpath_common+0x95/0xe0
  warn_slowpath_null+0x1a/0x20
  handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
  tdp_page_fault+0x231/0x290 [kvm]
  ? emulator_pio_in_out+0x6e/0xf0 [kvm]
  kvm_mmu_page_fault+0x36/0x240 [kvm]
  ? svm_set_cr0+0x95/0xc0 [kvm_amd]
  pf_interception+0xde/0x1d0 [kvm_amd]
  handle_exit+0x181/0xa70 [kvm_amd]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  ? preempt_count_sub+0x9b/0xf0
  ? mutex_lock_killable_nested+0x26f/0x490
  ? preempt_count_sub+0x9b/0xf0
  kvm_vcpu_ioctl+0x358/0x710 [kvm]
  ? __fget+0x5/0x210
  ? __fget+0x101/0x210
  do_vfs_ioctl+0x2f4/0x560
  ? __fget_light+0x29/0x90
  SyS_ioctl+0x4c/0x90
  entry_SYSCALL_64_fastpath+0x16/0x73
---[ end trace 37901c8686d84de6 ]---

Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:29 +02:00
Paolo Bonzini
6fec21449a KVM: x86: use correct page table format to check nested page table reserved bits
Intel CPUID on AMD host or vice versa is a weird case, but it can
happen.  Handle it by checking the host CPU vendor instead of the
guest's in reset_tdp_shadow_zero_bits_mask.  For speed, the
check uses the fact that Intel EPT has an X (executable) bit while
AMD NPT has NX.

Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:28 +02:00
Paolo Bonzini
79a8059d24 KVM: svm: do not call kvm_set_cr0 from init_vmcb
kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the
memslots array (SRCU protected).  Using a mini SRCU critical section
is ugly, and adding it to kvm_arch_vcpu_create doesn't work because
the VMX vcpu_create callback calls synchronize_srcu.

Fixes this lockdep splat:

===============================
[ INFO: suspicious RCU usage. ]
4.3.0-rc1+ #1 Not tainted
-------------------------------
include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage!

other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-i38/17000:
 #0:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm]

[...]
Call Trace:
 dump_stack+0x4e/0x84
 lockdep_rcu_suspicious+0xfd/0x130
 kvm_zap_gfn_range+0x188/0x1a0 [kvm]
 kvm_set_cr0+0xde/0x1e0 [kvm]
 init_vmcb+0x760/0xad0 [kvm_amd]
 svm_create_vcpu+0x197/0x250 [kvm_amd]
 kvm_arch_vcpu_create+0x47/0x70 [kvm]
 kvm_vm_ioctl+0x302/0x7e0 [kvm]
 ? __lock_is_held+0x51/0x70
 ? __fget+0x101/0x210
 do_vfs_ioctl+0x2f4/0x560
 ? __fget_light+0x29/0x90
 SyS_ioctl+0x4c/0x90
 entry_SYSCALL_64_fastpath+0x16/0x73

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:22 +02:00
David S. Miller
b626ef0128 Merge branch 'phy-mdio-refcnt'
Russell King says:

====================
Phy, mdiobus, and netdev struct device fixes

The third version of this series fixes the build error which David
identified, and drops the broken changes for the Cavium Thunger BGX
ethernet driver as this driver requires some complex changes to
resolve the leakage - and this is best done by people who can test
the driver.

Compared to v2, the only patch which has changed is patch 6
  "net: fix phy refcounting in a bunch of drivers"

I _think_ I've been able to build-test all the drivers touched by
that patch to some degree now, though several of them needed the
Kconfig hacked to allow it (not all had || COMPILE_TEST clause on
their dependencies.)

Previous cover letters below:

This is the second version of the series, with the comments David had
on the first patch fixed up.  Original series description with updated
diffstat below.

While looking at the DSA code, I noticed we have a
of_find_net_device_by_node(), and it looks like users of that are
similarly buggy - it looks like net/dsa/dsa.c is the only user.  Fix
that too.

Hi,

While looking at the phy code, I identified a number of weaknesses
where refcounting on device structures was being leaked, where
modules could be removed while in-use, and where the fixed-phy could
end up having unintended consequences caused by incorrect calls to
fixed_phy_update_state().

This patch series resolves those issues, some of which were discovered
with testing on an Armada 388 board.  Not all patches are fully tested,
particularly the one which touches several network drivers.

When resolving the struct device refcounting problems, several different
solutions were considered before settling on the implementation here -
one of the considerations was to avoid touching many network drivers.
The solution here is:

	phy_attach*() - takes a refcount
	phy_detach*() - drops the phy_attach refcount

Provided drivers always attach and detach their phys, which they should
already be doing, this should change nothing, even if they leak a refcount.

	of_phy_find_device() and of_* functions which use that take
	a refcount.  Arrange for this refcount to be dropped once
	the phy is attached.

This is the reason why the previous change is important - we can't drop
this refcount taken by of_phy_find_device() until something else holds
a reference on the device.  This resolves the leaked refcount caused by
using of_phy_connect() or of_phy_attach().

Even without the above changes, these drivers are leaking by calling
of_phy_find_device().  These drivers are addressed by adding the
appropriate release of that refcount.

The mdiobus code also suffered from the same kind of leak, but thankfully
this only happened in one place - the mdio-mux code.

I also found that the try_module_get() in the phy layer code was utterly
useless: phydev->dev.driver was guaranteed to always be NULL, so
try_module_get() was always being called with a NULL argument.  I proved
this with my SFP code, which declares its own MDIO bus - the module use
count was never incremented irrespective of how I set the MDIO bus up.
This allowed the MDIO bus code to be removed from the kernel while there
were still PHYs attached to it.

One other bug was discovered: while using in-band-status with mvneta, it
was found that if a real phy is attached with in-band-status enabled,
and another ethernet interface is using the fixed-phy infrastructure, the
interface using the fixed-phy infrastructure is configured according to
the other interface using the in-band-status - which is caused by the
fixed-phy code not verifying that the phy_device passed in is actually
a fixed-phy device, rather than a real MDIO phy.

Lastly, having mdio_bus reversing phy_device_register() internals seems
like a layering violation - it's trivial to move that code to the phy
device layer.
====================

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:53 -07:00
Russell King
9861f72074 net: fix net_device refcounting
of_find_net_device_by_node() uses class_find_device() internally to
lookup the corresponding network device.  class_find_device() returns
a reference to the embedded struct device, with its refcount
incremented.

Add a comment to the definition in net/core/net-sysfs.c indicating the
need to drop this refcount, and fix the DSA code to drop this refcount
when the OF-generated platform data is cleaned up and freed.  Also
arrange for the ref to be dropped when handling errors.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:53 -07:00
Russell King
38737e490d phy: add phy_device_remove()
Add a phy_device_remove() function to complement phy_device_register(),
which undoes the effects of phy_device_register() by removing the phy
device from visibility, but not freeing it.

This allows these details to be moved out of the mdio bus code into
the phy code where this action belongs.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:53 -07:00
Russell King
d618bf2bfd phy: fixed-phy: properly validate phy in fixed_phy_update_state()
Validate that the phy_device passed into fixed_phy_update_state() is a
fixed-phy device before walking the list of phys for a fixed phy at the
same address.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:53 -07:00
Russell King
04d53b20fe net: fix phy refcounting in a bunch of drivers
of_phy_find_device() increments the phy struct device refcount, which
we need to properly balance.  Add code to network drivers using this
function to ensure that the struct device refcount is correctly
balanced.

For xgene, looking back in the history, we should be able to use
of_phy_connect() with a zero flags argument for the DT case as this is
how the driver used to operate prior to de7b5b3d79 ("net: eth: xgene:
change APM X-Gene SoC platform ethernet to support ACPI").

This leaves the Cavium Thunder BGX unfixed; fixing this driver is a
complicated task, one which the maintainers need to be involved with.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:53 -07:00
Russell King
f018ae7a8c of_mdio: fix MDIO phy device refcounting
bus_find_device() is defined as:

 * This is similar to the bus_for_each_dev() function above, but it
 * returns a reference to a device that is 'found' for later use, as
 * determined by the @match callback.

and it does indeed return a reference-counted pointer to the device:

        while ((dev = next_device(&i)))
                if (match(dev, data) && get_device(dev))
                                        ^^^^^^^^^^^^^^^
                        break;
        klist_iter_exit(&i);
        return dev;

What that means is that when we're done with the struct device, we must
drop that reference.  Neither of_phy_connect() nor of_phy_attach() did
this when phy_connect_direct() or phy_attach_direct() failed.

With our previous patch, phy_connect_direct() and phy_attach_direct()
take a new refcount on the phy device when successful, so we can drop
our local reference immediatley after these functions, whether or not
they succeeded.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:52 -07:00
Russell King
7322967bc1 phy: add proper phy struct device refcounting
Take a refcount on the phy struct device when the phy device is attached
to a network device, and drop it after it's detached.  This ensures that
a refcount is held on the phy device while the device is being used by
a network device, thereby preventing the phy_device from being
unexpectedly kfree()'d by phy_device_release().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:52 -07:00
Russell King
3e3aaf6494 phy: fix mdiobus module safety
Re-implement the mdiobus module refcounting to ensure that we actually
ensure that the mdiobus module code does not go away while we might call
into it.

The old scheme using bus->dev.driver was buggy, because bus->dev is a
class device which never has a struct device_driver associated with it,
and hence the associated code trying to obtain a refcount did nothing
useful.

Instead, take the approach that other subsystems do: pass the module
when calling mdiobus_register(), and record that in the mii_bus struct.
When we need to increment the module use count in the phy code, use
this stored pointer.  When the phy is deteched, drop the module
refcount, remembering that the phy device might go away at that point.

This doesn't stop the mii_bus going away while there are in-use phys -
it merely stops the underlying code vanishing.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:52 -07:00
Russell King
e496ae690b net: dsa: fix of_mdio_find_bus() device refcount leak
Current users of of_mdio_find_bus() leak a struct device refcount, as
they fail to clean up the reference obtained inside class_find_device().

Fix the DSA code to properly refcount the returned MDIO bus by:
1. taking a reference on the struct device whenever we assign it to
   pd->chip[x].host_dev.
2. dropping the reference when we overwrite the existing reference.
3. dropping the reference when we free the data structure.
4. dropping the initial reference we obtained after setting up the
   platform data structure, or on failure.

In step 2 above, where we obtain a new MDIO bus, there is no need to
take a reference on it as we would only have to drop it immediately
after assignment again, iow:

	put_device(cd->host_dev);	/* drop original assignment ref */
	cd->host_dev = get_device(&mdio_bus_switch->dev); /* get our ref */
	put_device(&mdio_bus_switch->dev); /* drop of_mdio_find_bus ref */

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:52 -07:00
Russell King
a136442131 phy: fix of_mdio_find_bus() device refcount leak
of_mdio_find_bus() leaks a struct device refcount, caused by using
class_find_device() and not realising that the device reference has
its refcount incremented:

 * Note, you will need to drop the reference with put_device() after use.
...
        while ((dev = class_dev_iter_next(&iter))) {
                if (match(dev, data)) {
                        get_device(dev);
                        break;
                }

Update the comment, and arrange for the phy code to drop this refcount
when disposing of a reference to it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-24 23:04:52 -07:00
Linus Torvalds
ced255c0c5 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:

 - Power allocator governor changes to allow binding on thermal zones
   with missing power estimates information.  From Javi Merino.

 - Add compile test flags on thermal drivers that allow it without
   producing compilation errors.  From Eduardo Valentin.

 - Fixes around memory allocation on cpu_cooling.  From Javi Merino.

 - Fix on db8500 cpufreq code to allow autoload.  From Luis de
   Bethencourt.

 - Maintainer entries for cpu cooling device

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: power_allocator: exit early if there are no cooling devices
  thermal: power_allocator: don't require tzp to be present for the thermal zone
  thermal: power_allocator: relax the requirement of two passive trip points
  thermal: power_allocator: relax the requirement of a sustainable_power in tzp
  thermal: Add a function to get the minimum power
  thermal: cpu_cooling: free power table on error or when unregistering
  thermal: cpu_cooling: don't call kcalloc() under rcu_read_lock
  thermal: db8500_cpufreq_cooling: Fix module autoload for OF platform driver
  thermal: cpu_cooling: Add MAINTAINERS entry
  thermal: ti-soc: Kconfig fix to avoid menu showing wrongly
  thermal: ti-soc: allow compile test
  thermal: qcom_spmi: allow compile test
  thermal: exynos: allow compile test
  thermal: armada: allow compile test
  thermal: dove: allow compile test
  thermal: kirkwood: allow compile test
  thermal: rockchip: allow compile test
  thermal: spear: allow compile test
  thermal: hisi: allow compile test
  thermal: Fix thermal_zone_of_sensor_register to match documentation
2015-09-24 20:14:26 -07:00
Linus Torvalds
4401555a98 DeviceTree fixes for 4.3:
- Silence bogus warning for of_irq_parse_pci
 - Fix typo in ARM idle-states binding doc and dts files
 - Various minor binding documentation updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWBI2/AAoJEMhvYp4jgsXi08cH/11S/92w1PLNmWCv4y7TXHQw
 XTJi7/Hk+q6otV/FigXukdYClR17h/3mFyCKKXHrkAgy/AoSrvN/ABe0bLLoT2AQ
 xh0Rx8F6vZwa6ro1MZcrn3ZxQkoJlNUhoIXtY84oSPWd1ernLOar6HonFiynCQQc
 bDZo5zoLj6DBbSO+UpVvEQN57ogwPgFZ1hGDjJeyyH8c1755z2OVA+k8O0dwjmqW
 Xav/7TO8bFEIbUnfWeKnVyK45qmJxHcTbn3nxUgYFQj3DMI2Hn86WEY4cg8QvTo+
 SpdO1Aio6b9NSTNnpiSvPnc2MFNPFaWjiqm1w86w+PTm8oUT+p1V0OG/EE27fcY=
 =VUfk
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:
 - Silence bogus warning for of_irq_parse_pci
 - Fix typo in ARM idle-states binding doc and dts files
 - Various minor binding documentation updates

* tag 'devicetree-fixes-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  Documentation: arm: Fix typo in the idle-states bindings examples
  gpio: mention in DT binding doc that <name>-gpio is deprecated
  of_pci_irq: Silence bogus "of_irq_parse_pci() failed ..." messages.
  devicetree: bindings: Extend the bma180 bindings with bma250 info
  of: thermal: Mark cooling-*-level properties optional
  of: thermal: Fix inconsitency between cooling-*-state and cooling-*-level
  Docs: dt: add #msi-cells to GICv3 ITS binding
  of: add vendor prefix for Socionext Inc.
2015-09-24 17:46:38 -07:00