linux-stable/net/rds
Sowmini Varadhan 1a0e100fb2 RDS: TCP: Force every connection to be initiated by numerically smaller IP address
When 2 RDS peers initiate an RDS-TCP connection simultaneously,
there is a potential for "duelling syns" on either/both sides.
See commit 241b271952 ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") for a description of this
condition, and the arbitration logic which ensures that the
numerically larger IP address in the TCP connection is bound to the
RDS_TCP_PORT ("canonical ordering").

The rds_connection should not be marked as RDS_CONN_UP until the
arbitration logic has converged, for the following reason. The sender
may start transmitting RDS datagrams as soon as RDS_CONN_UP is set,
and it removes datagrams from the rds_connection's cp_retrans queue
based on TCP acks. If a TCP ack was sent from a tcp socket that got
reset as part of duel arbitration (but before data was delivered to
the receiver's RDS socket layer), the sender may end up prematurely
freeing the datagram, and the datagram is no longer reliably
deliverable.

This patch remedies that condition by making sure that, when
rds_tcp_state_change() is notified of the TCP_ESTABLISHED state change
that signals 3WH completion, we mark the rds_connection as RDS_CONN_UP
if, and only if, the IP addresses and ports for the connection are
canonically ordered. In all other cases, rds_tcp_state_change will
force an rds_conn_path_drop(), and rds_queue_reconnect() on
both peers will restart the connection to ensure canonical ordering.
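
The decision described above can be illustrated with a minimal,
self-contained sketch. This is not the kernel code from this patch:
the sketch_* names are hypothetical, ports are left out, and it assumes
that "canonically ordered" means the local (initiating) address is
numerically smaller than the peer's address once both are converted
from network to host byte order.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of handling a TCP_ESTABLISHED notification. */
enum sketch_action {
	SKETCH_MARK_UP,            /* canonical: connection may go to RDS_CONN_UP */
	SKETCH_DROP_AND_RECONNECT, /* not canonical: drop and let reconnect retry */
};

/*
 * Assumption: both addresses are IPv4 in network byte order. The
 * connection is treated as canonical when the local address is the
 * numerically smaller one, i.e. the smaller address initiated it.
 */
static bool sketch_is_canonical(uint32_t laddr_be, uint32_t faddr_be)
{
	return ntohl(laddr_be) < ntohl(faddr_be);
}

/* Decide what to do once the three-way handshake has completed. */
static enum sketch_action sketch_on_established(uint32_t laddr_be,
						uint32_t faddr_be)
{
	if (sketch_is_canonical(laddr_be, faddr_be))
		return SKETCH_MARK_UP;
	return SKETCH_DROP_AND_RECONNECT;
}
```

Under these assumptions, a connection initiated from 10.0.0.1 to
10.0.0.2 would be marked up, while one initiated from 10.0.0.2 to
10.0.0.1 would be dropped so that the reconnect restores the
canonical ordering.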

A side-effect of enforcing this condition in rds_tcp_state_change()
is that rds_tcp_accept_one_path() can now be refactored for simplicity.
It is also no longer possible to encounter an RDS_CONN_UP connection in
the arbitration logic in rds_tcp_accept_one().

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17 13:35:18 -05:00
..
af_rds.c RDS: TCP: Track peer's connection generation number 2016-11-17 13:35:18 -05:00
bind.c RDS: TCP: Enable multipath RDS for TCP 2016-07-15 11:36:58 -07:00
cong.c RDS: split out connection specific state from rds_connection to rds_conn_path 2016-06-14 23:50:41 -07:00
connection.c RDS: TCP: Force every connection to be initiated by numerically smaller IP address 2016-11-17 13:35:18 -05:00
ib.c IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
ib.h RDS: add __printf format attribute to error reporting functions 2016-08-08 16:16:21 -07:00
ib_cm.c RDS: TCP: Hooks to set up a single connection path 2016-07-01 16:45:17 -04:00
ib_fmr.c RDS: IB: move FMR code to its own file 2016-03-02 14:13:18 -05:00
ib_frmr.c IB/core: Add passing an offset into the SG to ib_map_mr_sg 2016-05-13 13:37:11 -04:00
ib_mr.h RDS: IB: Support Fastreg MR (FRMR) memory registration mode 2016-03-02 14:13:19 -05:00
ib_rdma.c RDS: split out connection specific state from rds_connection to rds_conn_path 2016-06-14 23:50:41 -07:00
ib_recv.c RDS: TCP: make receive path use the rds_conn_path 2016-07-01 16:45:17 -04:00
ib_ring.c
ib_send.c RDS: Rework path specific indirections 2016-07-01 16:45:17 -04:00
ib_stats.c RDS: IB: add mr reused stats 2016-03-02 14:13:19 -05:00
ib_sysctl.c net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
info.c rds: fix an integer overflow test in rds_info_getsockopt() 2015-08-03 15:20:16 -07:00
info.h
Kconfig RDS: Drop stale iWARP RDMA transport 2016-03-02 14:13:17 -05:00
loop.c RDS: TCP: Hooks to set up a single connection path 2016-07-01 16:45:17 -04:00
loop.h
Makefile rds: debug messages are enabled by default 2016-10-29 15:55:57 -04:00
message.c RDS: TCP: Track peer's connection generation number 2016-11-17 13:35:18 -05:00
page.c RDS: memory allocated must be align to 8 2016-04-07 16:58:27 -04:00
rdma.c RDS: Fix rds MR reference count in rds_rdma_unuse() 2015-08-25 16:28:10 -07:00
rdma_transport.c RDS: split out connection specific state from rds_connection to rds_conn_path 2016-06-14 23:50:41 -07:00
rdma_transport.h RDS: Drop stale iWARP RDMA transport 2016-03-02 14:13:17 -05:00
rds.h RDS: TCP: Track peer's connection generation number 2016-11-17 13:35:18 -05:00
rds_single_path.h RDS: split out connection specific state from rds_connection to rds_conn_path 2016-06-14 23:50:41 -07:00
recv.c RDS: TCP: Track peer's connection generation number 2016-11-17 13:35:18 -05:00
send.c RDS: TCP: Track peer's connection generation number 2016-11-17 13:35:18 -05:00
stats.c net/rds: zero last byte for strncpy 2013-03-08 00:35:44 -05:00
sysctl.c net: rds: fix coding style issues 2016-06-18 21:34:09 -07:00
tcp.c RDS: TCP: report addr/port info based on TCP socket in rds-info 2016-11-09 12:47:49 -05:00
tcp.h RDS: TCP: avoid bad page reference in rds_tcp_listen_data_ready 2016-07-15 11:36:57 -07:00
tcp_connect.c RDS: TCP: Force every connection to be initiated by numerically smaller IP address 2016-11-17 13:35:18 -05:00
tcp_listen.c RDS: TCP: Force every connection to be initiated by numerically smaller IP address 2016-11-17 13:35:18 -05:00
tcp_recv.c RDS: TCP: make receive path use the rds_conn_path 2016-07-01 16:45:17 -04:00
tcp_send.c RDS: TCP: set RDS_FLAG_RETRANSMITTED in cp_retrans list 2016-11-17 13:35:18 -05:00
tcp_stats.c
threads.c rds: Remove duplicate prefix from rds_conn_path_error use 2016-10-17 11:07:22 -04:00
transport.c net: rds: fix coding style issues 2016-06-18 21:34:09 -07:00