Commit Graph

795 Commits

Author SHA1 Message Date
David Ahern 242afaa696 net/ipv6: Put target net when address dump fails due to bad attributes
If tgt_net is set based on IFA_TARGET_NETNSID attribute in the dump
request, make sure all error paths call put_net.

Fixes: 6371a71f3a ("net/ipv6: Add support for dumping addresses for a specific device")
Fixes: ed6eff1179 ("net/ipv6: Update inet6_dump_addr for strict data checking")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-24 14:06:25 -07:00
David Ahern 6371a71f3a net/ipv6: Add support for dumping addresses for a specific device
If an RTM_GETADDR dump request has ifa_index set in the ifaddrmsg
header, then return only the addresses for that device.

Since inet6_dump_addr is reused for multicast and anycast addresses,
this adds support for device specfic dumps of RTM_GETMULTICAST and
RTM_GETANYCAST as well.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 19:33:29 -07:00
David Ahern fe884c2be0 net/ipv6: Remove ip_idx arg to in6_dump_addrs
ip_idx is always 0 going into in6_dump_addrs; it is passed as a pointer
to save the last good index into cb. Since cb is already argument to
in6_dump_addrs, just save the value there.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-22 19:33:29 -07:00
David S. Miller 21ea1d36f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David Ahern's dump indexing bug fix in 'net' overlapped the
change of the function signature of inet6_fill_ifaddr() in
'net-next'.  Trivially resolved.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-21 11:54:28 -07:00
David Ahern 4ba4c566ba net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
The loop wants to skip previously dumped addresses, so loops until
current index >= saved index. If the message fills it wants to save
the index for the next address to dump - ie., the one that did not
fit in the current message.

Currently, it is incrementing the index counter before comparing to the
saved index, and then the saved index is off by 1 - it assumes the
current address is going to fit in the message.

Change the index handling to increment only after a succesful dump.

Fixes: 502a2ffd73 ("ipv6: convert idev_list to list macros")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-20 15:43:14 -07:00
David Ahern addd383f5a net: Update netconf dump handlers for strict data checking
Update inet_netconf_dump_devconf, inet6_netconf_dump_devconf, and
mpls_netconf_dump_devconf for strict data checking. If the flag is set,
the dump request is expected to have an netconfmsg struct as the header.
The struct only has the family member and no attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern 786e0007e2 rtnetlink: Update inet6_dump_ifinfo for strict data checking
Update inet6_dump_ifinfo for strict data checking. If the flag is
set, the dump request is expected to have an ifinfomsg struct as
the header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern ed6eff1179 net/ipv6: Update inet6_dump_addr for strict data checking
Update inet6_dump_addr for strict data checking. If the flag is set, the
dump request is expected to have an ifaddrmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values suppored by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID
attribute is supported. Follow on patches can add support for other fields
(e.g., honor ifa_index and only return data for the given device index).

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern 6ba1e6e856 net/ipv6: Refactor address dump to push inet6_fill_args to in6_dump_addrs
Pull the inet6_fill_args arg up to in6_dump_addrs and move netnsid
into it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern dac9c9790e net: Add extack to nlmsg_parse
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David S. Miller a06ee256e5 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Version bump conflict in batman-adv, take what's in net-next.

iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-25 10:35:29 -07:00
Jeff Barnhill 86f9bd1ff6 net/ipv6: Display all addresses in output of /proc/net/if_inet6
The backend handling for /proc/net/if_inet6 in addrconf.c doesn't properly
handle starting/stopping the iteration.  The problem is that at some point
during the iteration, an overflow is detected and the process is
subsequently stopped.  The item being shown via seq_printf() when the
overflow occurs is not actually shown, though.  When start() is
subsequently called to resume iterating, it returns the next item, and
thus the item that was being processed when the overflow occurred never
gets printed.

Alter the meaning of the private data member "offset".  Currently, when it
is not 0 (which only happens at the very beginning), "offset" represents
the next hlist item to be printed.  After this change, "offset" always
represents the current item.

This is also consistent with the private data member "bucket", which
represents the current bucket, and also the use of "pos" as defined in
seq_file.txt:
    The pos passed to start() will always be either zero, or the most
    recent pos used in the previous session.

Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:10:49 -07:00
Robert Shearman 3ede0bbcdf ipv6: Allow the l3mdev to be a loopback
There is no way currently for an IPv6 client connect using a loopback
address in a VRF, whereas for IPv4 the loopback address can be added:

    $ sudo ip addr add dev vrfred 127.0.0.1/8
    $ sudo ip -6 addr add ::1/128 dev vrfred
    RTNETLINK answers: Cannot assign requested address

So allow ::1 to be configured on an L3 master device. In order for
this to be usable ip_route_output_flags needs to not consider ::1 to
be a link scope address (since oif == l3mdev and so it would be
dropped), and ipv6_rcv needs to consider the l3mdev to be a loopback
device so that it doesn't drop the packets.

Signed-off-by: Robert Shearman <rshearma@vyatta.att-mail.com>
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-19 21:23:01 -07:00
Christian Brauner 203651b665 ipv6: add inet6_fill_args
inet6_fill_if{addr,mcaddr, acaddr}() already took 6 arguments which
meant the 7th argument would need to be pushed onto the stack on x86.
Add a new struct inet6_fill_args which holds common information passed
to inet6_fill_if{addr,mcaddr, acaddr}() and shortens the functions to
three pointer arguments.

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:27:11 -07:00
Christian Brauner 6ecf4c37eb ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR
- Backwards Compatibility:
  If userspace wants to determine whether ipv6 RTM_GETADDR requests
  support the new IFA_TARGET_NETNSID property it should verify that the
  reply includes the IFA_TARGET_NETNSID property. If it does not
  userspace should assume that IFA_TARGET_NETNSID is not supported for
  ipv6 RTM_GETADDR requests on this kernel.
- From what I gather from current userspace tools that make use of
  RTM_GETADDR requests some of them pass down struct ifinfomsg when they
  should actually pass down struct ifaddrmsg. To not break existing
  tools that pass down the wrong struct we will do the same as for
  RTM_GETLINK | NLM_F_DUMP requests and not error out when the
  nlmsg_parse() fails.

- Security:
  Callers must have CAP_NET_ADMIN in the owning user namespace of the
  target network namespace.

Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-05 22:27:11 -07:00
Cong Wang e500c6d349 addrconf: reduce unnecessary atomic allocations
All the 3 callers of addrconf_add_mroute() assert RTNL
lock, they don't take any additional lock either, so
it is safe to convert it to GFP_KERNEL.

Same for sit_add_v4_addrs().

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22 21:42:07 -07:00
David S. Miller 19725496da Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-07-24 19:21:58 -07:00
Wei Wang e873e4b9cc ipv6: use fib6_info_hold_safe() when necessary
In the code path where only rcu read lock is held, e.g. in the route
lookup code path, it is not safe to directly call fib6_info_hold()
because the fib6_info may already have been deleted but still exists
in the rcu grace period. Holding reference to it could cause double
free and crash the kernel.

This patch adds a new function fib6_info_hold_safe() and replace
fib6_info_hold() in all necessary places.

Syzbot reported 3 crash traces because of this. One of them is:
8021q: adding VLAN 0 to HW filter on device team0
IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready
dst_release: dst:(____ptrval____) refcnt:-1
dst_release: dst:(____ptrval____) refcnt:-2
WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline]
WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204
dst_release: dst:(____ptrval____) refcnt:-1
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 panic+0x238/0x4e7 kernel/panic.c:184
dst_release: dst:(____ptrval____) refcnt:-2
dst_release: dst:(____ptrval____) refcnt:-3
 __warn.cold.8+0x163/0x1ba kernel/panic.c:536
dst_release: dst:(____ptrval____) refcnt:-4
 report_bug+0x252/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
dst_release: dst:(____ptrval____) refcnt:-5
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
RIP: 0010:dst_hold include/net/dst.h:239 [inline]
RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204
Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78
RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293
RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6
RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005
RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338
R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720
R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980
 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768
 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376
 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
 sock_sendmsg_nosec net/socket.c:641 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:651
 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125
 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220
 __do_sys_sendmmsg net/socket.c:2249 [inline]
 __se_sys_sendmmsg net/socket.c:2246 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x446ba9
Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9
RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003
RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843
R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..

Fixes: 93531c6743 ("net/ipv6: separate handling of FIB entries from dst based routes")
Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com
Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com
Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 11:19:02 -07:00
Sabrina Dubroca f24c5987dd net/ipv6: propagate net.ipv6.conf.all.addr_gen_mode to devices
This aligns the addr_gen_mode sysctl with the expected behavior of the
"all" variant.

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca bdd72f4133 net/ipv6: reserve room for IFLA_INET6_ADDR_GEN_MODE
inet6_ifla6_size() is called to check how much space is needed by
inet6_fill_link_af() and inet6_fill_ifinfo(), both of which include
the IFLA_INET6_ADDR_GEN_MODE attribute. Reserve some room for it.

Fixes: bc91b0f07a ("ipv6: addrconf: implement address generation modes")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca 70c30d76e5 net/ipv6: don't reinitialize ndev->cnf.addr_gen_mode on new inet6_dev
The value has already been copied from this netns's devconf_dflt, it
shouldn't be reset to the global kernel default.

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
Sabrina Dubroca c6dbf7aaa4 net/ipv6: fix addrconf_sysctl_addr_gen_mode
addrconf_sysctl_addr_gen_mode() has multiple problems. First, it ignores
the errors returned by proc_dointvec().

addrconf_sysctl_addr_gen_mode() calls proc_dointvec() directly, which
writes the value to memory, and then checks if it's valid and may return
EINVAL. If a bad value is given, the value displayed when reading
net.ipv6.conf.foo.addr_gen_mode next time will be invalid. In case the
value provided by the user was valid, addrconf_dev_config() won't be
called since idev->cnf.addr_gen_mode has already been updated.

Fix this in the usual way we deal with values that need to be checked
after the proc_do*() helper has returned: define a local ctl_table and
storage, call proc_dointvec() on that temporary area, then check and
store.

addrconf_sysctl_addr_gen_mode() also writes the new value to the global
ipv6_devconf_dflt, when we're writing to some netns's default, so that
new netns will inherit the value that was set by the change occuring in
any netns. That doesn't make any sense, so let's drop this assignment.

Finally, since addr_gen_mode is a __u32, switch to proc_douintvec().

Fixes: d35a00b8e3 ("net/ipv6: allow sysctl to change link-local address generation mode")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-11 22:50:45 -07:00
David Ahern e7c7faa936 net/ipv6: Fix updates to prefix route
Sowmini reported that a recent commit broke prefix routes for linklocal
addresses. The newly added modify_prefix_route is attempting to add a
new prefix route when the ifp priority does not match the route metric
however the check needs to account for the default priority. In addition,
the route add fails because the route already exists, and then the delete
removes the one that exists. Flip the order to do the delete first.

Fixes: 8308f3ff17 ("net/ipv6: Add support for specifying metric of connected routes")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-30 20:49:58 +09:00
David Ahern 3f2d67b6bd net/ipv6: Ensure cfg is properly initialized in ipv6_create_tempaddr
Valdis reported a BUG in ipv6_add_addr:

[ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
[ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
[ 1820.832732] Code: 49 8b 1f 0f 84 6a 0a 00 00 48 85 db 0f 84 4e 0a 00 00 48 8b 03 48 8b 53 08 49 89 45 00 49 8b 47 10
49 89 55 08 48 85 c0 74 15 <48> 8b 50 08 48 8b 00 49 89 95 b8 01 00 00 49 89 85 b0 01 00 00 4c
[ 1820.832847] RSP: 0018:ffffaa07c2fd7880 EFLAGS: 00010202
[ 1820.832853] RAX: 0000000000000201 RBX: ffffaa07c2fd79b0 RCX: 0000000000000000
[ 1820.832858] RDX: a4cfbfba2cbfa64c RSI: 0000000000000000 RDI: ffffffff8a8e9fa0
[ 1820.832862] RBP: ffffaa07c2fd7920 R08: 000000000000017a R09: ffffffff8a555300
[ 1820.832866] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888d18e71c00
[ 1820.832871] R13: ffff888d0a9b1200 R14: 0000000000000000 R15: ffffaa07c2fd7980
[ 1820.832876] FS:  00007faa51bdb800(0000) GS:ffff888d1d400000(0000) knlGS:0000000000000000
[ 1820.832880] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1820.832885] CR2: 0000000000000209 CR3: 000000021e8f8001 CR4: 00000000001606e0
[ 1820.832888] Call Trace:
[ 1820.832898]  ? __local_bh_enable_ip+0x119/0x260
[ 1820.832904]  ? ipv6_create_tempaddr+0x259/0x5a0
[ 1820.832912]  ? __local_bh_enable_ip+0x139/0x260
[ 1820.832921]  ipv6_create_tempaddr+0x2da/0x5a0
[ 1820.832926]  ? ipv6_create_tempaddr+0x2da/0x5a0
[ 1820.832941]  manage_tempaddrs+0x1a5/0x240
[ 1820.832951]  inet6_addr_del+0x20b/0x3b0
[ 1820.832959]  ? nla_parse+0xce/0x1e0
[ 1820.832968]  inet6_rtm_deladdr+0xd9/0x210
[ 1820.832981]  rtnetlink_rcv_msg+0x1d4/0x5f0

Looking at the code I found 1 element (peer_pfx) of the newly introduced
ifa6_config struct that is not initialized. Use a memset rather than hard
coding an init for each struct element.

Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Fixes: e6464b8c63 ("net/ipv6: Convert ipv6_add_addr to struct ifa6_config")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-11 16:39:02 -07:00
Linus Torvalds 1c8c5a9d38 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add Maglev hashing scheduler to IPVS, from Inju Song.

 2) Lots of new TC subsystem tests from Roman Mashak.

 3) Add TCP zero copy receive and fix delayed acks and autotuning with
    SO_RCVLOWAT, from Eric Dumazet.

 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
    Brouer.

 5) Add ttl inherit support to vxlan, from Hangbin Liu.

 6) Properly separate ipv6 routes into their logically independant
    components. fib6_info for the routing table, and fib6_nh for sets of
    nexthops, which thus can be shared. From David Ahern.

 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
    messages from XDP programs. From Nikita V. Shirokov.

 8) Lots of long overdue cleanups to the r8169 driver, from Heiner
    Kallweit.

 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.

10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.

11) Plumb extack down into fib_rules, from Roopa Prabhu.

12) Add Flower classifier offload support to igb, from Vinicius Costa
    Gomes.

13) Add UDP GSO support, from Willem de Bruijn.

14) Add documentation for eBPF helpers, from Quentin Monnet.

15) Add TLS tx offload to mlx5, from Ilya Lesokhin.

16) Allow applications to be given the number of bytes available to read
    on a socket via a control message returned from recvmsg(), from
    Soheil Hassas Yeganeh.

17) Add x86_32 eBPF JIT compiler, from Wang YanQing.

18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
    From Björn Töpel.

19) Remove indirect load support from all of the BPF JITs and handle
    these operations in the verifier by translating them into native BPF
    instead. From Daniel Borkmann.

20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.

21) Allow XDP programs to do lookups in the main kernel routing tables
    for forwarding. From David Ahern.

22) Allow drivers to store hardware state into an ELF section of kernel
    dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.

23) Various RACK and loss detection improvements in TCP, from Yuchung
    Cheng.

24) Add TCP SACK compression, from Eric Dumazet.

25) Add User Mode Helper support and basic bpfilter infrastructure, from
    Alexei Starovoitov.

26) Support ports and protocol values in RTM_GETROUTE, from Roopa
    Prabhu.

27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
    Brouer.

28) Add lots of forwarding selftests, from Petr Machata.

29) Add generic network device failover driver, from Sridhar Samudrala.

* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
  strparser: Add __strp_unpause and use it in ktls.
  rxrpc: Fix terminal retransmission connection ID to include the channel
  net: hns3: Optimize PF CMDQ interrupt switching process
  net: hns3: Fix for VF mailbox receiving unknown message
  net: hns3: Fix for VF mailbox cannot receiving PF response
  bnx2x: use the right constant
  Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
  net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
  enic: fix UDP rss bits
  netdev-FAQ: clarify DaveM's position for stable backports
  rtnetlink: validate attributes in do_setlink()
  mlxsw: Add extack messages for port_{un, }split failures
  netdevsim: Add extack error message for devlink reload
  devlink: Add extack to reload and port_{un, }split operations
  net: metrics: add proper netlink validation
  ipmr: fix error path when ipmr_new_table fails
  ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
  net: hns3: remove unused hclgevf_cfg_func_mta_filter
  netfilter: provide udp*_lib_lookup for nf_tproxy
  qed*: Utilize FW 8.37.2.0
  ...
2018-06-06 18:39:49 -07:00
Subash Abhinov Kasiviswanathan 9deb441c11 net: ipv6: Generate random IID for addresses on RAWIP devices
RAWIP devices such as rmnet do not have a hardware address and
instead require the kernel to generate a random IID for the
IPv6 addresses.

Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-05 10:16:25 -04:00
David Ahern 8308f3ff17 net/ipv6: Add support for specifying metric of connected routes
Add support for IFA_RT_PRIORITY to ipv6 addresses.

If the metric is changed on an existing address then the new route
is inserted before removing the old one. Since the metric is one
of the route keys, the prefix route can not be atomically replaced.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:45 -04:00
David Ahern d169a1f8ba net/ipv6: Pass ifa6_config struct to inet6_addr_modify
Update inet6_addr_modify to take ifa6_config argument versus a parameter
list. This is an argument move only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:45 -04:00
David Ahern 19b1518c29 net/ipv6: Pass ifa6_config struct to inet6_addr_add
Move the creation of struct ifa6_config up to callers of inet6_addr_add.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:44 -04:00
David Ahern e6464b8c63 net/ipv6: Convert ipv6_add_addr to struct ifa6_config
Move config parameters for adding an ipv6 address to a struct. struct
names stem from inet6_rtm_newaddr which is the modern handler for
adding an address.

Start the conversion to ifa6_config with ipv6_add_addr. This is an argument
move only; no functional change intended. Mapping of variable changes:

    addr      -->  cfg->pfx
    peer_addr -->  cfg->peer_pfx
    pfxlen    -->  cfg->plen
    flags     -->  cfg->ifa_flags

scope, valid_lft, prefered_lft have the same names within cfg
(with corrected spelling).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-29 10:12:44 -04:00
Christoph Hellwig c350637227 proc: introduce proc_create_net{,_data}
Variants of proc_create{,_data} that directly take a struct seq_operations
and deal with network namespaces in ->open and ->release.  All callers of
proc_create + seq_open_net converted over, and seq_{open,release}_net are
removed entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-16 07:24:30 +02:00
Ivan Vecera 0aef78aa7b ipv6: addrconf: don't evaluate keep_addr_on_down twice
The addrconf_ifdown() evaluates keep_addr_on_down state twice. There
is no need to do it.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-25 13:03:37 -04:00
David Ahern 27b10608a2 net/ipv6: Fix gfp_flags arg to addrconf_prefix_route
Eric noticed that __ipv6_ifa_notify is called under rcu_read_lock, so
the gfp argument to addrconf_prefix_route can not be GFP_KERNEL.

While scrubbing other calls I noticed addrconf_addr_gen has one
place with GFP_ATOMIC that can be GFP_KERNEL.

Fixes: acb54e3cba ("net/ipv6: Add gfp_flags to route add functions")
Reported-by: syzbot+2add39b05179b31f912f@syzkaller.appspotmail.com
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19 15:40:13 -04:00
David Ahern 9ee8cbb2fd net/ipv6: Remove aca_idev
aca_idev has only 1 user - inet6_fill_ifacaddr - and it only
wants the device index which can be extracted from the fib6_info
nexthop.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19 15:40:13 -04:00
David Ahern 360a9887c8 net/ipv6: Rename addrconf_dst_alloc
addrconf_dst_alloc now returns a fib6_info. Update the name
and its users to reflect the change.

Rename only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19 15:40:13 -04:00
David Ahern 93c2fb253d net/ipv6: Rename fib6_info struct elements
Change the prefix for fib6_info struct elements from rt6i_ to fib6_.
rt6i_pcpu and rt6i_exception_bucket are left as is given that they
point to rt6_info entries.

Rename only; not functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19 15:40:12 -04:00
David Ahern 8d1c802b28 net/ipv6: Flip FIB entries to fib6_info
Convert all code paths referencing a FIB entry from
rt6_info to fib6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:18 -04:00
David Ahern 93531c6743 net/ipv6: separate handling of FIB entries from dst based routes
Last step before flipping the data type for FIB entries:
- use fib6_info_alloc to create FIB entries in ip6_route_info_create
  and addrconf_dst_alloc
- use fib6_info_release in place of dst_release, ip6_rt_put and
  rt6_release
- remove the dst_hold before calling __ip6_ins_rt or ip6_del_rt
- when purging routes, drop per-cpu routes
- replace inc and dec of rt6i_ref with fib6_info_hold and fib6_info_release
- use rt->from since it points to the FIB entry
- drop references to exception bucket, fib6_metrics and per-cpu from
  dst entries (those are relevant for fib entries only)

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:17 -04:00
David Ahern acb54e3cba net/ipv6: Add gfp_flags to route add functions
Most FIB entries can be added using memory allocated with GFP_KERNEL.
Add gfp_flags to ip6_route_add and addrconf_dst_alloc. Code paths that
can be reached from the packet path (e.g., ndisc and autoconfig) or
atomic notifiers use GFP_ATOMIC; paths from user context (adding
addresses and routes) use GFP_KERNEL.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:17 -04:00
David Ahern 3b6761d18b net/ipv6: Move dst flags to booleans in fib entries
Continuing to wean FIB paths off of dst_entry, use a bool to hold
requests for certain dst settings. Add a helper to convert the
flags to DST flags when a FIB entry is converted to a dst_entry.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:17 -04:00
David Ahern 14895687d3 net/ipv6: move expires into rt6_info
Add expires to rt6_info for FIB entries, and add fib6 helpers to
manage it. Data path use of dst.expires remains.

The transition is fairly straightforward: when working with fib entries,
rt->dst.expires is just rt->expires, rt6_clean_expires is replaced with
fib6_clean_expires, rt6_set_expires becomes fib6_set_expires, and
rt6_check_expired becomes fib6_check_expired, where the fib6 versions
are added by this patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:17 -04:00
David Ahern 5e670d844b net/ipv6: Move nexthop data to fib6_nh
Introduce fib6_nh structure and move nexthop related data from
rt6_info and rt6_info.dst to fib6_nh. References to dev, gateway or
lwtstate from a FIB lookup perspective are converted to use fib6_nh;
datapath references to dst version are left as is.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:16 -04:00
David Ahern e8478e80e5 net/ipv6: Save route type in rt6_info
The RTN_ type for IPv6 FIB entries is currently embedded in rt6i_flags
and dst.error. Since dst is going to be removed, it can no longer be
relied on for FIB dumps so save the route type as fib6_type.

fc_type is set in current users based on the algorithm in rt6_fill_node:
  - rt6i_flags contains RTF_LOCAL: fc_type = RTN_LOCAL
  - rt6i_flags contains RTF_ANYCAST: fc_type = RTN_ANYCAST
  - else fc_type = RTN_UNICAST

Similarly, fib6_type is set in the rt6_info templates based on the
RTF_REJECT section of rt6_fill_node converting dst.error to RTN type.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:16 -04:00
David Ahern afb1d4b593 net/ipv6: Pass net namespace to route functions
Pass network namespace reference into route add, delete and get
functions.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:16 -04:00
Lorenzo Bianconi a2d481b326 ipv6: send netlink notifications for manually configured addresses
Send a netlink notification when userspace adds a manually configured
address if DAD is enabled and optimistic flag isn't set.
Moreover send RTM_DELADDR notifications for tentative addresses.

Some userspace applications (e.g. NetworkManager) are interested in
addr netlink events albeit the address is still in tentative state,
however events are not sent if DAD process is not completed.
If the address is added and immediately removed userspace listeners
are not notified. This behaviour can be easily reproduced by using
veth interfaces:

$ ip -b - <<EOF
> link add dev vm1 type veth peer name vm2
> link set dev vm1 up
> link set dev vm2 up
> addr add 2001:db8:a🅱️1:2:3:4/64 dev vm1
> addr del 2001:db8:a🅱️1:2:3:4/64 dev vm1
EOF

This patch reverts the behaviour introduced by the commit f784ad3d79
("ipv6: do not send RTM_DELADDR for tentative addresses")

Suggested-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 14:03:56 -04:00
Lorenzo Bianconi f85f94b871 ipv6: remove unnecessary check in addrconf_prefix_rcv_add_addr()
Remove unnecessary check on update_lft variable in
addrconf_prefix_rcv_add_addr routine since it is always set to 0.
Moreover remove update_lft re-initialization to 0

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-16 18:16:16 -04:00
Kirill Tkhai 2f635ceeb2 net: Drop pernet_operations::async
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 13:18:09 -04:00
Joe Perches e32ac25018 ipv6: addrconf: Use normal debugging style
Remove local ADBG macro and use netdev_dbg/pr_debug

Miscellanea:

o Remove unnecessary debug message after allocation failure as there
  already is a dump_stack() on the failure paths
o Leave the allocation failure message on snmp6_alloc_dev as there
  is one code path that does not do a dump_stack()

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-27 10:54:40 -04:00
Joe Perches d6444062f8 net: Use octal not symbolic permissions
Prefer the direct use of octal for permissions.

Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
and some typing.

Miscellanea:

o Whitespace neatening around these conversions.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 12:07:48 -04:00
David Ahern 1893ff2027 net/ipv6: Add l3mdev check to ipv6_chk_addr_and_flags
Lookup the L3 master device for the passed in device. Only consider
addresses on netdev's with the same master device. If the device is
not enslaved or is NULL, then the l3mdev is NULL which means only
devices not enslaved (ie, in the default domain) are considered.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-16 11:28:38 -04:00