Macvtap should be compatible with tuntap for
maximum number of queues.
commit 'baf71c5c1f80d82e92924050a60b5baaf97e3094 (tuntap:
Increase the number of queues in tun.)' removes
the limitations and increases number of queues in tuntap.
Now, Its safe to increase number of queues in Macvtap as well.
This patch also modifies 'macvtap_del_queues' function
to avoid extra memory allocation in stack.
Changes from v1->v2 :
Michael S. Tsirkin, Jason Wang :
Better way to use linked list to
avoid use of extra memory in stack.
Sergei Shtylyov : Specify dependent commit's summary.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new mode of operation to macvlan, called "source".
It allows one to set a list of allowed mac address, which is used
to match against source mac address from received frames on underlying
interface.
This enables creating mac based VLAN associations, instead of standard
port or tag based. The feature is useful to deploy 802.1x mac based
behavior, where drivers of underlying interfaces doesn't allows that.
Configuration is done through the netlink interface using e.g.:
ip link add link eth0 name macvlan0 type macvlan mode source
ip link add link eth0 name macvlan1 type macvlan mode source
ip link set link dev macvlan0 type macvlan macaddr add 00:11:11:11:11:11
ip link set link dev macvlan0 type macvlan macaddr add 00:22:22:22:22:22
ip link set link dev macvlan0 type macvlan macaddr add 00:33:33:33:33:33
ip link set link dev macvlan1 type macvlan macaddr add 00:33:33:33:33:33
ip link set link dev macvlan1 type macvlan macaddr add 00:44:44:44:44:44
This allows clients with MAC addresses 00:11:11:11:11:11,
00:22:22:22:22:22 to be part of only VLAN associated with macvlan0
interface. Clients with MAC addresses 00:44:44:44:44:44 with only VLAN
associated with macvlan1 interface. And client with MAC address
00:33:33:33:33:33 to be associated with both VLANs.
Based on work of Stefan Gula <steweg@gmail.com>
v8: last version of Stefan Gula for Kernel 3.2.1
v9: rework onto linux-next 2014-03-12 by Michael Braun
add MACADDR_SET command, enable to configure mac for source mode
while creating interface
v10:
- reduce indention level
- rename source_list to source_entry
- use aligned 64bit ether address
- use hash_64 instead of addr[5]
v11:
- rebase for 3.14 / linux-next 20.04.2014
v12
- rebase for linux-next 2014-09-25
Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add netpoll support to macvlan devices. Based on the netpoll support in the 802.1q vlan code.
Tested and macvlan could work well with netconsole.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
They are same, so unify them as one; since macvlan is a kind of vlan,
vlan_pcpu_stats should be a proper name for vlan and macvlan.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only used in one file, no need to expose
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since now macvlan and macvtap use the same receive and
forward handlers, we can remove them completely and use
netif_rx and dev_forward_skb() directly.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce helper function macvlan_dev_real_dev which returns the
underlying device of a macvlan device, similar to vlan_dev_real_dev()
for 802.1q VLAN devices.
v2: IFF_MACVLAN flag and equivalent of is_macvlan_dev() were
introduced in the meantime
v3: do BUG() if compiled without macvlan support
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a operations structure that allows a network interface to export
the fact that it supports package forwarding in hardware between
physical interfaces and other mac layer devices assigned to it (such
as macvlans). This operaions structure can be used by virtual mac
devices to bypass software switching so that forwarding can be done
in hardware more efficiently.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the user issues TUNSETOFFLOAD ioctl, macvtap does not do
anything other then to verify arguments. This patch adds
functionality to allow users to actually control offload features.
NETIF_F_GSO and NETIF_F_GRO are always on, but the rest of the
features can be controlled.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds TUNSETQUEUE ioctl to let userspace can temporarily disable or
enable a queue of macvtap. This is used to be compatible at API layer of tuntap
to simplify the userspace to manage the queues. This is done through introducing
a linked list to track all taps while using vlan->taps array to only track
active taps.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Macvtap should be at least compatible with tap, so change the max number to 16.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting up IPv6 addresses on configurations with many macvlans
is not really working, as many multicast messages are dropped.
Add a multicast filter to macvlan to reduce the amount of cloned
skbs and overhead.
Successfully tested with 1024 macvlans on one ethernet device.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds FDB bridge ops to the macvlan device passthru mode.
Additionally a flags field was added and a NOPROMISC bit to
allow users to use passthru mode without the driver calling
dev_set_promiscuity(). The flags field is a u16 placed in a
4 byte hole (consuming 2 bytes) of the macvlan_dev struct.
We want to do this so that the macvlan driver or stack
above the macvlan driver does not have to process every
packet. For the use case where we know all the MAC addresses
of the endstations above us this works well.
This patch is a result of Roopa Prabhu's work. Follow up
patches are needed for VEPA and VEB macvlan modes.
v2: Change from distinct nopromisc mode to a flags field to
configure this. This avoids the tendency to add a new
mode every time we need some slightly different behavior.
v3: fix error in dev_set_promiscuity and add change and get
link attributes for flags.
CC: Roopa Prabhu <roprabhu@cisco.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On systems that create and delete lots of dynamic devices the
31bit linux ifindex fails to fit in the 16bit macvtap minor,
resulting in unusable macvtap devices. I have systems running
automated tests that that hit this condition in just a few days.
Use a linux idr allocator to track which mavtap minor numbers
are available and and to track the association between macvtap
minor numbers and macvtap network devices.
Remove the unnecessary unneccessary check to see if the network
device we have found is indeed a macvtap device. With macvtap
specific data structures it is impossible to find any other
kind of networking device.
Increase the macvtap minor range from 65536 to the full 20 bits
that is supported by linux device numbers. It doesn't solve the
original problem but there is no penalty for a larger minor
device range.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
macvlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.
This patch completely removes locking in TX path.
tx stat counters are added into existing percpu stat structure, renamed
from rx_stats to pcpu_stats.
Note : this reverts commit 2c11455321 (macvlan: add multiqueue
capability)
Note : rx_errors converted to a 32bit counter, like tx_dropped, since
they dont need 64bit range.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement multiqueue facility for macvtap driver. The idea is that
a macvtap device can be opened multiple times and the fd's can be
used to register eg, as backend for vhost.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Wagner reported OOM symptoms when sending UDP traffic over
a macvtap link to a kvm receiver.
This appears to be caused by the fact that macvtap packet queues
are unlimited in length. This means that if the receiver can't
keep up with the rate of flow, then we will hit OOM. Of course
it gets worse if the OOM killer then decides to kill the receiver.
This patch imposes a cap on the packet queue length, in the same
way as the tuntap driver, using the device TX queue length.
Please note that macvtap currently has no way of giving congestion
notification, that means the software device TX queue cannot be
used and packets will always be dropped once the macvtap driver
queue fills up.
This shouldn't be a great problem for the scenario where macvtap
is used to feed a kvm receiver, as the traffic is most likely
external in origin so congestion notification can't be applied
anyway.
Of course, if anybody decides to complain about guest-to-guest
UDP packet loss down the track, then we may have to revisit this.
Incidentally, this patch also fixes a real memory leak when
macvtap_get_queue fails.
Chris Wright noticed that for this patch to work, we need a
non-zero TX queue length. This patch includes his work to change
the default macvtap TX queue length to 500.
Reported-by: Mark Wagner <mwagner@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use u64_stats_sync infrastructure to implement 64bit stats.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
What this patch does is it removes two receive frame hooks (for bridge and for
macvlan) from __netif_receive_skb. These are replaced them with a single
hook for both. It only supports one hook per device because it makes no
sense to do bridging and macvlan on the same device.
Then a network driver (of virtual netdev like macvlan or bridge) can register
an rx_handler for needed net device.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now there's null check here and also again in the hook. Looking at bridge bits
which are simmilar, port structure is rcu_dereferenced right away in
handle_bridge and passed to hook. Looks nicer.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for passing a macvtap file descriptor into
vhost-net, much like we already do for tun/tap.
Most of the new code is taken from the respective patch
in the tun driver and may get consolidated in the future.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add __percpu sparse annotations to net drivers.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use macvlan with qemu and other tools that require
a tap file descriptor, the macvtap driver adds a small backend
with a character device with the same interface as the tun
driver, with a minimum set of features.
Macvtap interfaces are created in the same way as macvlan
interfaces using ip link, but the netif is just used as a
handle for configuration and accounting, while the data
goes through the chardev. Each macvtap interface has its
own character device, simplifying permission management
significantly over the generic tun/tap driver.
Cc: Patrick McHardy <kaber@trash.net>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: David S. Miller" <davem@davemloft.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Or Gerlitz <ogerlitz@voltaire.com>
Cc: netdev@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This makes it possible to hook into the macvlan driver
from another kernel module. In particular, the goal is
to extend it with the macvtap backend that provides
a tun/tap compatible interface directly on the macvlan
device.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the "#ifdef __KERNEL__" tests from unexported header files in
linux/include whose entire contents are wrapped in that preprocessor
test.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>