linux-stable/net/tipc
Tuong Lien d0f84d0856 tipc: fix issues with early FAILOVER_MSG from peer
It appears that a FAILOVER_MSG can come from peer even when the failure
link is resetting (i.e. just after the 'node_write_unlock()'...). This
means the failover procedure on the node has not been started yet.
The situation is as follows:

         node1                                node2
  linkb          linka                  linka        linkb
    |              |                      |            |
    |              |                      x failure    |
    |              |                  RESETTING        |
    |              |                      |            |
    |              x failure            RESET          |
    |          RESETTING             FAILINGOVER       |
    |              |   (FAILOVER_MSG)     |            |
    |<-------------------------------------------------|
    | *FAILINGOVER |                      |            |
    |              | (dummy FAILOVER_MSG) |            |
    |------------------------------------------------->|
    |            RESET                    |            | FAILOVER_END
    |         FAILINGOVER               RESET          |
    .              .                      .            .
    .              .                      .            .
    .              .                      .            .

Once this happens, the link failover procedure will be triggered
wrongly on the receiving node since the node isn't in FAILINGOVER state
but then another link failover will be carried out.
The consequences are:

1) A peer might get stuck in FAILINGOVER state because the 'sync_point'
was set, reset and set incorrectly, the criteria to end the failover
would not be met, it could keep waiting for a message that has already
received.

2) The early FAILOVER_MSG(s) could be queued in the link failover
deferdq but would be purged or not pulled out because the 'drop_point'
was not set correctly.

3) The early FAILOVER_MSG(s) could be dropped too.

4) The dummy FAILOVER_MSG could make the peer leaving FAILINGOVER state
shortly, but later on it would be restarted.

The same situation can also happen when the link is in PEER_RESET state
and a FAILOVER_MSG arrives.

The commit resolves the issues by forcing the link down immediately, so
the failover procedure will be started normally (which is the same as
when receiving a FAILOVER_MSG and the link is in up state).

Also, the function "tipc_node_link_failover()" is toughen to avoid such
a situation from happening.

Acked-by: Jon Maloy <jon.maloy@ericsson.se>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-18 10:03:37 -07:00
..
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Makefile tipc: enable tracepoints in tipc 2018-12-19 11:49:24 -08:00
addr.c tipc: handle collisions of 32-bit node address hash values 2018-03-23 13:12:18 -04:00
addr.h tipc: add 128-bit node identifier 2018-03-23 13:12:18 -04:00
bcast.c tipc: add NULL pointer check 2019-04-04 17:34:11 -07:00
bcast.h tipc: fix a null pointer deref 2019-03-21 09:56:55 -07:00
bearer.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
bearer.h tipc: enable tracepoints in tipc 2018-12-19 11:49:24 -08:00
core.c tipc: fix modprobe tipc failed after switch order of device registration 2019-05-20 10:45:43 -07:00
core.h tipc: introduce new capability flag for cluster 2019-03-19 13:56:17 -07:00
diag.c tipc: switch to rhashtable iterator 2018-08-29 18:04:54 -07:00
discover.c tipc: fix lockdep warning when reinitilaizing sockets 2018-11-17 22:01:31 -08:00
discover.h tipc: some cleanups in the file discover.c 2018-03-23 13:12:17 -04:00
eth_media.c
group.c tipc: purge deferredq list for each grp member in tipc_group_delete 2019-06-16 20:42:05 -07:00
group.h tipc: extend sock diag for group communication 2018-06-30 21:05:42 +09:00
ib_media.c
link.c tipc: fix issues with early FAILOVER_MSG from peer 2019-06-18 10:03:37 -07:00
link.h tipc: fix missing Name entries due to half-failover 2019-05-04 00:59:51 -04:00
monitor.c netlink: make nla_nest_start() add NLA_F_NESTED flag 2019-04-27 17:03:44 -04:00
monitor.h
msg.c tipc: buffer overflow handling in listener socket 2018-09-29 11:24:22 -07:00
msg.h tipc: reduce duplicate packets for unicast traffic 2019-04-04 18:29:25 -07:00
name_distr.c tipc: eliminate message disordering during binding table update 2018-10-22 19:29:12 -07:00
name_distr.h tipc: permit overlapping service ranges in name table 2018-03-31 22:19:52 -04:00
name_table.c netlink: make nla_nest_start() add NLA_F_NESTED flag 2019-04-27 17:03:44 -04:00
name_table.h tipc: eliminate message disordering during binding table update 2018-10-22 19:29:12 -07:00
net.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
net.h tipc: fix lockdep warning when reinitilaizing sockets 2018-11-17 22:01:31 -08:00
netlink.c genetlink: optionally validate strictly/dumps 2019-04-27 17:07:22 -04:00
netlink.h
netlink_compat.c genetlink: optionally validate strictly/dumps 2019-04-27 17:07:22 -04:00
node.c tipc: fix issues with early FAILOVER_MSG from peer 2019-06-18 10:03:37 -07:00
node.h tipc: improve TIPC throughput by Gap ACK blocks 2019-04-04 18:29:25 -07:00
socket.c tipc: fix hanging clients using poll with EPOLLOUT flag 2019-05-09 09:26:09 -07:00
socket.h tipc: add trace_events for tipc socket 2018-12-19 11:49:24 -08:00
subscr.c tipc: fix unbalanced reference counter 2018-04-12 21:46:10 -04:00
subscr.h tipc: fix modprobe tipc failed after switch order of device registration 2019-05-20 10:45:43 -07:00
sysctl.c tipc: set sysctl_tipc_rmem and named_timeout right range 2019-04-16 21:32:02 -07:00
topsrv.c tipc: fix modprobe tipc failed after switch order of device registration 2019-05-20 10:45:43 -07:00
topsrv.h tipc: rename tipc_server to tipc_topsrv 2018-02-16 15:26:34 -05:00
trace.c tipc: remove unneeded semicolon in trace.c 2019-01-17 22:04:43 -08:00
trace.h tipc: add trace_events for tipc bearer 2018-12-19 11:49:25 -08:00
udp_media.c netlink: make validation more configurable for future strictness 2019-04-27 17:07:21 -04:00
udp_media.h tipc: implement configuration of UDP media MTU 2018-04-20 11:04:05 -04:00