Commit graph

50 commits

Author SHA1 Message Date
Wei Liu
33bc801ddd Revert "xen-netback: improve ring effeciency for guest RX"
This reverts commit 4f0581d258.

The named changeset is causing problem. Let's aim to make this part less
fragile before trying to improve things.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Matt Wilson <msw@amazon.com>
Cc: Xi Xiong <xixiong@amazon.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-08 15:10:48 -04:00
Wei Liu
4f0581d258 xen-netback: improve ring effeciency for guest RX
There was a bug that netback routines netbk/xenvif_skb_count_slots and
netbk/xenvif_gop_frag_copy disagreed with each other, which caused
netback to push wrong number of responses to netfront, which caused
netfront to eventually crash. The bug was fixed in 6e43fc04a
("xen-netback: count number required slots for an skb more carefully").

Commit 6e43fc04a focused on backport-ability. The drawback with the
existing packing scheme is that the ring is not used effeciently, as
stated in 6e43fc04a.

skb->data like:
    |        1111|222222222222|3333        |

is arranged as:
    |1111        |222222222222|3333        |

If we can do this:
    |111122222222|22223333    |
That would save one ring slot, which improves ring effeciency.

This patch effectively reverts 6e43fc04a. That patch made count_slots
agree with gop_frag_copy, while this patch goes the other way around --
make gop_frag_copy agree with count_slots. The end result is that they
still agree with each other, and the ring is now arranged like:
    |111122222222|22223333    |

The patch that improves packing was first posted by Xi Xong and Matt
Wilson. I only rebase it on top of net-next and rewrite commit message,
so I retain all their SoBs. For more infomation about the original bug
please refer to email listed below and commit message of 6e43fc04a.

Original patch:
http://lists.xen.org/archives/html/xen-devel/2013-07/msg00760.html

Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[ msw: minor code cleanups, rewrote commit message, adjusted code
  to count RX slots instead of meta structures ]
Signed-off-by: Matt Wilson <msw@amazon.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
[ liuw: rebased on top of net-next tree, rewrote commit message, coding
  style cleanup. ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-30 19:14:11 -04:00
David Vrabel
6e43fc04a6 xen-netback: count number required slots for an skb more carefully
When a VM is providing an iSCSI target and the LUN is used by the
backend domain, the generated skbs for direct I/O writes to the disk
have large, multi-page skb->data but no frags.

With some lengths and starting offsets, xen_netbk_count_skb_slots()
would be one short because the simple calculation of
DIV_ROUND_UP(skb_headlen(), PAGE_SIZE) was not accounting for the
decisions made by start_new_rx_buffer() which does not guarantee
responses are fully packed.

For example, a skb with length < 2 pages but which spans 3 pages would
be counted as requiring 2 slots but would actually use 3 slots.

skb->data:

    |        1111|222222222222|3333        |

Fully packed, this would need 2 slots:

    |111122222222|22223333    |

But because the 2nd page wholy fits into a slot it is not split across
slots and goes into a slot of its own:

    |1111        |222222222222|3333        |

Miscounting the number of slots means netback may push more responses
than the number of available requests.  This will cause the frontend
to get very confused and report "Too many frags/slots".  The frontend
never recovers and will eventually BUG.

Fix this by counting the number of required slots more carefully.  In
xen_netbk_count_skb_slots(), more closely follow the algorithm used by
xen_netbk_gop_skb() by introducing xen_netbk_count_frag_slots() which
is the dry-run equivalent of netbk_gop_frag_copy().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-12 23:22:13 -04:00
Wei Liu
7376419a46 xen-netback: rename functions
As we move to 1:1 model and melt xen_netbk and xenvif together, it would
be better to use single prefix for all functions in xen-netback.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-29 01:18:04 -04:00
Wei Liu
b3f980bd82 xen-netback: switch to NAPI + kthread 1:1 model
This patch implements 1:1 model netback. NAPI and kthread are utilized
to do the weight-lifting job:

- NAPI is used for guest side TX (host side RX)
- kthread is used for guest side RX (host side TX)

Xenvif and xen_netbk are made into one structure to reduce code size.

This model provides better scheduling fairness among vifs. It is also
prerequisite for implementing multiqueue for Xen netback.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-29 01:18:04 -04:00
Wei Liu
43e9d19432 xen-netback: remove page tracking facility
The data flow from DomU to DomU on the same host in current copying
scheme with tracking facility:

       copy
DomU --------> Dom0          DomU
 |                            ^
 |____________________________|
             copy

The page in Dom0 is a page with valid MFN. So we can always copy from
page Dom0, thus removing the need for a tracking facility.

       copy           copy
DomU --------> Dom0 -------> DomU

Simple iperf test shows no performance regression (obviously we copy
twice either way):

  W/  tracking: ~5.3Gb/s
  W/o tracking: ~5.4Gb/s

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Matt Wilson <msw@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-29 01:18:04 -04:00
Joe Perches
383eda32b8 xen: Use more current logging styles
Instead of mixing printk and pr_<level> forms,
just use pr_<level>

Miscellaneous changes around these conversions:

Add a missing newline to avoid message interleaving,
coalesce formats, reflow modified lines to 80 columns.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 13:31:25 -07:00
Dan Carpenter
07cc61bfc0 xen-netback: double free on unload
There is a typo here, "i" vs "j", so we would crash on module_exit().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 00:24:57 -07:00
David S. Miller
d98cae64e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers to not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 16:49:39 -07:00
Jan Beulich
94f950c406 xen-netback: don't de-reference vif pointer after having called xenvif_put()
When putting vif-s on the rx notify list, calling xenvif_put() must be
deferred until after the removal from the list and the issuing of the
notification, as both operations dereference the pointer.

Changing this got me to notice that the "irq" variable was effectively
unused (and was of too narrow type anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 01:25:24 -07:00
Wei Liu
e1f00a69ec xen-netback: split event channels support for Xen backend driver
Netback and netfront only use one event channel to do TX / RX notification,
which may cause unnecessary wake-up of processing routines. This patch adds a
new feature called feature-split-event-channels to netback, enabling it to
handle TX and RX events separately.

Netback will use tx_irq to notify guest for TX completion, rx_irq for RX
notification.

If frontend doesn't support this feature, tx_irq equals to rx_irq.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-23 18:40:37 -07:00
Wei Liu
b103f358d9 xen-netback: enable user to unload netback module
This patch enables user to unload netback module, which is useful when user
wants to upgrade to a newer netback module without rebooting the host.

Netfront cannot handle netback removal event. As we cannot fix all possible
frontends we add module get / put along with vif get / put to avoid
mis-unloading of netback. To unload netback module, user needs to shutdown all
VMs or migrate them to another host or unplug all vifs before hand.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>¬
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-17 18:23:07 -07:00
Wei Liu
f1db320ec5 xen-netback: remove dead code
The array mmap_pages is never touched in the initialization function. This is
remnant of mapping mechanism, which does not exist upstream. In current
upstream code this array only tracks usage of pages inside netback. Those
pages are allocated when contructing a SKB and passed directly to network
subsystem.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-17 18:23:07 -07:00
Wei Liu
376414945d xen-netback: better names for thresholds
This patch only changes some names to avoid confusion.

In this patch we have:

  MAX_SKB_SLOTS_DEFAULT -> FATAL_SKB_SLOTS_DEFAULT
  max_skb_slots -> fatal_skb_slots
  #define XEN_NETBK_LEGACY_SLOTS_MAX XEN_NETIF_NR_SLOTS_MIN

The fatal_skb_slots is the threshold to determine whether a packet is
malicious.

XEN_NETBK_LEGACY_SLOTS_MAX is the maximum slots a valid packet can have at
this point. It is defined to be XEN_NETIF_NR_SLOTS_MIN because that's
guaranteed to be supported by all backends.

Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-02 16:50:08 -04:00
Wei Liu
59ccb4ebbc xen-netback: avoid allocating variable size array on stack
Tune xen_netbk_count_requests to not touch working array beyond limit, so that
we can make working array size constant.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-02 16:50:08 -04:00
Wei Liu
ac69c26e7a xen-netback: remove redundent parameter in netbk_count_requests
Tracking down from the caller, first_idx is always equal to vif->tx.req_cons.
Remove it to avoid confusion.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-02 16:50:08 -04:00
Wei Liu
03393fd5cc xen-netback: don't disconnect frontend when seeing oversize packet
Some frontend drivers are sending packets > 64 KiB in length. This length
overflows the length field in the first slot making the following slots have
an invalid length.

Turn this error back into a non-fatal error by dropping the packet. To avoid
having the following slots having fatal errors, consume all slots in the
packet.

This does not reopen the security hole in XSA-39 as if the packet as an
invalid number of slots it will still hit fatal error case.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 15:37:01 -04:00
Wei Liu
2810e5b9a7 xen-netback: coalesce slots in TX path and fix regressions
This patch tries to coalesce tx requests when constructing grant copy
structures. It enables netback to deal with situation when frontend's
MAX_SKB_FRAGS is larger than backend's MAX_SKB_FRAGS.

With the help of coalescing, this patch tries to address two regressions
avoid reopening the security hole in XSA-39.

Regression 1. The reduction of the number of supported ring entries (slots)
per packet (from 18 to 17). This regression has been around for some time but
remains unnoticed until XSA-39 security fix. This is fixed by coalescing
slots.

Regression 2. The XSA-39 security fix turning "too many frags" errors from
just dropping the packet to a fatal error and disabling the VIF. This is fixed
by coalescing slots (handling 18 slots when backend's MAX_SKB_FRAGS is 17)
which rules out false positive (using 18 slots is legit) and dropping packets
using 19 to `max_skb_slots` slots.

To avoid reopening security hole in XSA-39, frontend sending packet using more
than max_skb_slots is considered malicious.

The behavior of netback for packet is thus:

    1-18            slots: valid
   19-max_skb_slots slots: drop and respond with an error
   max_skb_slots+   slots: fatal error

max_skb_slots is configurable by admin, default value is 20.

Also change variable name from "frags" to "slots" in netbk_count_requests.

Please note that RX path still has dependency on MAX_SKB_FRAGS. This will be
fixed with separate patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-22 15:37:01 -04:00
Jason Wang
bea8933647 xen-netback: switch to use skb_partial_csum_set()
Switch to use skb_partial_csum_set() to simplify the codes.

Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-12 14:58:33 -04:00
stephen hemminger
9eaee8beee xen-netback: fix sparse warning
Fix warning about 0 used as NULL.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-10 23:23:42 -04:00
Jason Wang
40893fd0fd net: switch to use skb_probe_transport_header()
Switch to use the new help skb_probe_transport_header() to do the l4 header
probing for untrusted sources. For packets with partial csum, the header should
already been set by skb_partial_csum_set().

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-27 12:48:31 -04:00
Jason Wang
f9ca8f7439 netback: set transport header before passing it to kernel
Currently, for the packets receives from netback, before doing header check,
kernel just reset the transport header in netif_receive_skb() which pretends non
l4 header. This is suboptimal for precise packet length estimation (introduced
in 1def9238: net_sched: more precise pkt_len computation) which needs correct l4
header for gso packets.

The patch just reuse the header probed by netback for partial checksum packets
and tries to use skb_flow_dissect() for other cases, if both fail, just pretend
no l4 header.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-26 12:44:44 -04:00
Wei Liu
27f852282a xen-netback: remove skb in xen_netbk_alloc_page
This variable is never used.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-25 12:19:47 -04:00
David S. Miller
629821d9b0 Revert "xen: netback: remove redundant xenvif_put"
This reverts commit d37204566a.

This change is incorrect, as per Jan Beulich:

====================
But this is wrong from all we can tell, we discussed this before
(Wei pointed to the discussion in an earlier reply). The core of
it is that the put here parallels the one in netbk_tx_err(), and
the one in xenvif_carrier_off() matches the get from
xenvif_connect() (which normally would be done on the path
coming through xenvif_disconnect()).
====================

And a previous discussion of this issue is at:

http://marc.info/?l=xen-devel&m=136084174026977&w=2

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 13:04:34 -05:00
Andrew Jones
d37204566a xen: netback: remove redundant xenvif_put
netbk_fatal_tx_err() calls xenvif_carrier_off(), which does
a xenvif_put(). As callers of netbk_fatal_tx_err should only
have one reference to the vif at this time, then the xenvif_put
in netbk_fatal_tx_err is one too many.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:09 -05:00
David Vrabel
35876b5ffc xen-netback: correctly return errors from netbk_count_requests()
netbk_count_requests() could detect an error, call
netbk_fatal_tx_error() but return 0.  The vif may then be used
afterwards (e.g., in a call to netbk_tx_error().

Since netbk_fatal_tx_error() could set vif->refcnt to 1, the vif may
be freed immediately after the call to netbk_fatal_tx_error() (e.g.,
if the vif is also removed).

Netback thread              Xenwatch thread
-------------------------------------------
netbk_fatal_tx_err()        netback_remove()
                              xenvif_disconnect()
                                ...
                                free_netdev()
netbk_tx_err() Oops!

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Christopher S. Aker <caker@theshore.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-14 13:16:49 -05:00
Ian Campbell
b9149729eb netback: correct netbk_tx_err to handle wrap around.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:29:29 -05:00
Ian Campbell
4cc7c1cb7b xen/netback: free already allocated memory on failure in xen_netbk_get_requests
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:29:28 -05:00
Matthew Daley
7d5145d8eb xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:29:28 -05:00
Ian Campbell
48856286b6 xen/netback: shutdown the ring if it contains garbage.
A buggy or malicious frontend should not be able to confuse netback.
If we spot anything which is not as it should be then shutdown the
device and don't try to continue with the ring in a potentially
hostile state. Well behaved and non-hostile frontends will not be
penalised.

As well as making the existing checks for such errors fatal also add a
new check that ensures that there isn't an insane number of requests
on the ring (i.e. more than would fit in the ring). If the ring
contains garbage then previously is was possible to loop over this
insane number, getting an error each time and therefore not generating
any more pending requests and therefore not exiting the loop in
xen_netbk_tx_build_gops for an externded period.

Also turn various netdev_dbg calls which no precipitate a fatal error
into netdev_err, they are rate limited because the device is shutdown
afterwards.

This fixes at least one known DoS/softlockup of the backend domain.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:29:28 -05:00
Ian Campbell
6a8ed462f1 xen: netback: handle compound page fragments on transmit.
An SKB paged fragment can consist of a compound page with order > 0.
However the netchannel protocol deals only in PAGE_SIZE frames.

Handle this in netbk_gop_frag_copy and xen_netbk_count_skb_slots by
iterating over the frames which make up the page.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-10 22:50:45 -04:00
Konrad Rzeszutek Wilk
ae1659ee6b Merge branch 'xenarm-for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm into stable/for-linus-3.7
* 'xenarm-for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
  arm: introduce a DTS for Xen unprivileged virtual machines
  MAINTAINERS: add myself as Xen ARM maintainer
  xen/arm: compile netback
  xen/arm: compile blkfront and blkback
  xen/arm: implement alloc/free_xenballooned_pages with alloc_pages/kfree
  xen/arm: receive Xen events on ARM
  xen/arm: initialize grant_table on ARM
  xen/arm: get privilege status
  xen/arm: introduce CONFIG_XEN on ARM
  xen: do not compile manage, balloon, pci, acpi, pcpu and cpu_hotplug on ARM
  xen/arm: Introduce xen_ulong_t for unsigned long
  xen/arm: Xen detection and shared_info page mapping
  docs: Xen ARM DT bindings
  xen/arm: empty implementation of grant_table arch specific functions
  xen/arm: sync_bitops
  xen/arm: page.h definitions
  xen/arm: hypercalls
  arm: initial Xen support

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-26 16:43:35 -04:00
Andres Lagar-Cavilla
c571898ffc xen/gndev: Xen backend support for paged out grant targets V4.
Since Xen-4.2, hvm domains may have portions of their memory paged out. When a
foreign domain (such as dom0) attempts to map these frames, the map will
initially fail. The hypervisor returns a suitable errno, and kicks an
asynchronous page-in operation carried out by a helper. The foreign domain is
expected to retry the mapping operation until it eventually succeeds. The
foreign domain is not put to sleep because itself could be the one running the
pager assist (typical scenario for dom0).

This patch adds support for this mechanism for backend drivers using grant
mapping and copying operations. Specifically, this covers the blkback and
gntdev drivers (which map foreign grants), and the netback driver (which copies
foreign grants).

* Add a retry method for grants that fail with GNTST_eagain (i.e. because the
  target foreign frame is paged out).
* Insert hooks with appropriate wrappers in the aforementioned drivers.

The retry loop is only invoked if the grant operation status is GNTST_eagain.
It guarantees to leave a new status code different from GNTST_eagain. Any other
status code results in identical code execution as before.

The retry loop performs 256 attempts with increasing time intervals through a
32 second period. It uses msleep to yield while waiting for the next retry.

V2 after feedback from David Vrabel:
* Explicit MAX_DELAY instead of wrap-around delay into zero
* Abstract GNTST_eagain check into core grant table code for netback module.

V3 after feedback from Ian Campbell:
* Add placeholder in array of grant table error descriptions for unrelated
  error code we jump over.
* Eliminate single map and retry macro in favor of a generic batch flavor.
* Some renaming.
* Bury most implementation in grant_table.c, cleaner interface.

V4 rebased on top of sync of Xen grant table interface headers.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[v5: Fixed whitespace issues]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-21 09:23:51 -04:00
Stefano Stabellini
ca98163376 xen/arm: compile netback
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-08-08 17:21:23 +00:00
Annie Li
1e0b6eac6a xen/netback: only non-freed SKB is queued into tx_queue
After SKB is queued into tx_queue, it will be freed if request_gop is NULL.
However, no dequeue action is called in this situation, it is likely that
tx_queue constains freed SKB. This patch should fix this issue, and it is
based on 3.5.0-rc4+.

This issue is found through code inspection, no bug is seen with it currently.
I run netperf test for several hours, and no network regression was found.

Signed-off-by: Annie Li <annie.li@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-29 00:50:20 -07:00
Simon Graham
e26b203ede xen/netback: Calculate the number of SKB slots required correctly
When calculating the number of slots required for a packet header, the code
was reserving too many slots if the header crossed a page boundary. Since
netbk_gop_skb copies the header to the start of the page, the count of
slots required for the header should be based solely on the header size.

This problem is easy to reproduce if a VIF is bridged to a USB 3G modem
device as the skb->data value always starts near the end of the first page.

Signed-off-by: Simon Graham <simon.graham@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-24 16:22:53 -04:00
Joe Perches
e404decb0f drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages
alloc failures use dump_stack so emitting an additional
out-of-memory message is an unnecessary duplication.

Remove the allocation failure messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-31 16:20:21 -05:00
Linus Torvalds
90160371b3 Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
  xen/pciback: Expand the warning message to include domain id.
  xen/pciback: Fix "device has been assigned to X domain!" warning
  xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
  xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
  xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
  xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
  Xen: consolidate and simplify struct xenbus_driver instantiation
  xen-gntalloc: introduce missing kfree
  xen/xenbus: Fix compile error - missing header for xen_initial_domain()
  xen/netback: Enable netback on HVM guests
  xen/grant-table: Support mappings required by blkback
  xenbus: Use grant-table wrapper functions
  xenbus: Support HVM backends
  xen/xenbus-frontend: Fix compile error with randconfig
  xen/xenbus-frontend: Make error message more clear
  xen/privcmd: Remove unused support for arch specific privcmp mmap
  xen: Add xenbus_backend device
  xen: Add xenbus device driver
  xen: Add privcmd device driver
  xen/gntalloc: fix reference counts on multi-page mappings
  ...
2012-01-10 10:09:59 -08:00
Daniel De Graaf
2a14b24406 xen/netback: Enable netback on HVM guests
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-20 17:07:50 -05:00
David S. Miller
959327c784 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-06 21:10:05 -05:00
Wei Liu
e34c0246d6 netback: fix typo in comment
"variables a used" should be "variables are used".

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:20:56 -05:00
Wei Liu
16ecba2605 netback: remove redundant assignment
New value for netbk->mmap_pages[pending_idx] is assigned in
xen_netbk_alloc_page(), no need for a second assignment which
exposes internal to the outside world.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 13:14:24 -05:00
Wei Liu
6b84bd1674 netback: Fix alert message.
The original message in netback_init was 'kthread_run() fails', which should be
'kthread_create() fails'.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-06 00:32:39 -05:00
Jan Beulich
5ccb3ea720 xen-netback: use correct index for invalidation in xen_netbk_tx_check_gop()
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-21 15:32:15 -05:00
Linus Torvalds
06d381484f Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  net: xen-netback: use API provided by xenbus module to map rings
  block: xen-blkback: use API provided by xenbus module to map rings
  xen: use generic functions instead of xen_{alloc, free}_vm_area()
2011-11-06 18:31:36 -08:00
David Vrabel
c9d6369978 net: xen-netback: use API provided by xenbus module to map rings
The xenbus module provides xenbus_map_ring_valloc() and
xenbus_map_ring_vfree().  Use these to map the Tx and Rx ring pages
granted by the frontend.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-26 10:02:56 -04:00
Eric Dumazet
9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Ian Campbell
ea066ad158 xen: netback: convert to SKB paged frag API.
netback currently uses frag->page to store a temporary index reference while
processing incoming requests. Since frag->page is to become opaque switch
instead to using page_offset. Add a wrapper to tidy this up and propagate the
fact that the indexes are only u16 through the code (this was already true in
practice but unsigned long and in were inconsistently used as variable and
parameter types)

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: xen-devel@lists.xensource.com
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-05 17:36:00 -04:00
Bastian Blank
f984cec64a xen/netback: Add module alias for autoloading
Add xen-backend:vif module alias to the xen-netback module. This allows
automatic loading of the module.

Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-30 11:19:09 -07:00
Ian Campbell
f942dc2552 xen network backend driver
netback is the host side counterpart to the frontend driver in
drivers/net/xen-netfront.c. The PV protocol is also implemented by
frontend drivers in other OSes too, such as the BSDs and even Windows.

The patch is based on the driver from the xen.git pvops kernel tree but
has been put through the checkpatch.pl wringer plus several manual
cleanup passes and review iterations. The driver has been moved from
drivers/xen/netback to drivers/net/xen-netback.

One major change from xen.git is that the guest transmit path (i.e. what
looks like receive to netback) has been significantly reworked to remove
the dependency on the out of tree PageForeign page flag (a core kernel
patch which enables a per page destructor callback on the final
put_page). This page flag was used in order to implement a grant map
based transmit path (where guest pages are mapped directly into SKB
frags). Instead this version of netback uses grant copy operations into
regular memory belonging to the backend domain. Reinstating the grant
map functionality is something which I would like to revisit in the
future.

Note that this driver depends on 2e820f58f7 "xen/irq: implement
bind_interdomain_evtchn_to_irqhandler for backend drivers" which is in
linux next via the "xen-two" tree and is intended for the 2.6.39 merge
window:
        git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/backends
this branch has only that single commit since 2.6.38-rc2 and is safe for
cross merging into the net branch.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15 19:38:03 -07:00