linux-stable/net/sunrpc/xprtrdma
Chuck Lever cb0ae1fbb2 xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup()
While trying NFSv4.0/RDMA with sec=krb5p, I noticed small NFS READ
operations failed. After the client unwrapped the NFS READ reply
message, the NFS READ XDR decoder was not able to decode the reply.
The message was "Server cheating in reply", with the reported
number of received payload bytes being zero. Applications reported
a read(2) that returned -1/EIO.

The problem is that rpcrdma_inline_fixup() sets tail.iov_len to zero
when the incoming reply fits entirely in the head iovec. The zero
tail.iov_len confused xdr_buf_trim(), which then mangled the actual
reply data instead of simply removing the trailing GSS checksum.

As near as I can tell, RPC transports are not supposed to update the
head.iov_len, page_len, or tail.iov_len fields in the receive XDR
buffer when handling an incoming RPC reply message. These fields
contain the length of each component of the XDR buffer, and hence
the maximum number of bytes of reply data that can be stored in each
XDR buffer component. I've concluded this because:

- This is how xdr_partial_copy_from_skb() appears to behave
- rpcrdma_inline_fixup() already does not alter page_len
- call_decode() compares rq_private_buf and rq_rcv_buf and WARNs
  if they are not exactly the same

Unfortunately, as soon as I tried the simple fix to just remove the
line that sets tail.iov_len to zero, I saw that the logic that
appends the implicit Write chunk pad inline depends on inline_fixup
setting tail.iov_len to zero.

To address this, re-organize the tail iovec handling logic to use
the same approach as with the head iovec: simply point tail.iov_base
to the correct bytes in the receive buffer.
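
The approach described above can be illustrated with a minimal user-space
sketch. It is not the kernel code from this commit: the sketch_kvec and
sketch_xdr_buf types, the flat "pages" array standing in for the page list,
and the sketch_inline_fixup() name are all hypothetical, and pad handling is
left out. It only shows the idea that the head and tail are redirected by
changing iov_base, while every length field is left untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's kvec and xdr_buf structures. */
struct sketch_kvec {
	char  *iov_base;
	size_t iov_len;		/* capacity of this component; never modified here */
};

struct sketch_xdr_buf {
	struct sketch_kvec head;
	char  *pages;		/* flat array used in place of a real page list */
	size_t page_len;	/* capacity of the page area; never modified here */
	struct sketch_kvec tail;
};

/*
 * Distribute copy_len bytes of reply data starting at srcp across the
 * head, page area, and tail of buf. Returns the number of bytes copied
 * into the page area. Assumes whatever is left after the head and page
 * area fits within tail.iov_len.
 */
static size_t sketch_inline_fixup(struct sketch_xdr_buf *buf,
				  char *srcp, size_t copy_len)
{
	size_t curlen, copied = 0;

	/* Redirect the head at the reply in the receive buffer: no copy,
	 * and no change to head.iov_len. */
	buf->head.iov_base = srcp;
	curlen = buf->head.iov_len;
	if (curlen > copy_len)
		curlen = copy_len;
	srcp += curlen;
	copy_len -= curlen;

	/* Bytes that follow the head are copied into the page area,
	 * limited by its capacity. */
	if (copy_len && buf->page_len) {
		copied = copy_len < buf->page_len ? copy_len : buf->page_len;
		memcpy(buf->pages, srcp, copied);
		srcp += copied;
		copy_len -= copied;
	}

	/* Whatever remains stays in the receive buffer; the tail is simply
	 * pointed at it. tail.iov_len is left alone, even when nothing
	 * remains. */
	if (copy_len)
		buf->tail.iov_base = srcp;

	return copied;
}
```

In this sketch, a reply that fits entirely in the head leaves tail.iov_len at
its original value rather than zero, which is the property the reasoning
above says later consumers of the receive buffer rely on.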

While I remember all this, write down the conclusion in documenting
comments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-07-11 15:50:43 -04:00
backchannel.c sunrpc: Advertise maximum backchannel payload size 2016-05-17 15:47:57 -04:00
fmr_ops.c xprtrdma: Place registered MWs on a per-req list 2016-07-11 15:50:43 -04:00
frwr_ops.c xprtrdma: Place registered MWs on a per-req list 2016-07-11 15:50:43 -04:00
Makefile xprtrdma: Remove ALLPHYSICAL memory registration mode 2016-07-11 15:50:43 -04:00
module.c rpcrdma: Merge svcrdma and xprtrdma modules into one 2015-06-04 16:56:02 -04:00
rpc_rdma.c xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup() 2016-07-11 15:50:43 -04:00
svc_rdma.c svcrdma: Define maximum number of backchannel requests 2016-01-19 15:30:48 -05:00
svc_rdma_backchannel.c svcrdma: Use new CQ API for RPC-over-RDMA server send CQs 2016-03-01 13:06:43 -08:00
svc_rdma_marshal.c svcrdma: Generalize svc_rdma_xdr_decode_req() 2016-05-13 15:53:06 -04:00
svc_rdma_recvfrom.c A very quiet cycle for nfsd, mainly just an RDMA update from Chuck Lever. 2016-05-24 14:39:20 -07:00
svc_rdma_sendto.c svcrdma: svc_rdma_put_context() is invoked twice in Send error path 2016-05-13 15:53:05 -04:00
svc_rdma_transport.c svcrdma: Drain QP before freeing svcrdma_xprt 2016-05-13 15:53:06 -04:00
transport.c xprtrdma: Place registered MWs on a per-req list 2016-07-11 15:50:43 -04:00
verbs.c xprtrdma: Place registered MWs on a per-req list 2016-07-11 15:50:43 -04:00
xprt_rdma.h xprtrdma: Chunk list encoders no longer share one rl_segments array 2016-07-11 15:50:43 -04:00