2014-04-10 20:27:32

by Andrei Vagin

Subject: [PATCH RFC 0/2] tcp: allow to repair tcp connections in closing states

Currently, only connections in the TCP_ESTABLISHED state can be dumped and
restored. This series allows connections in the FIN_WAIT_1, FIN_WAIT_2,
LAST_ACK, CLOSE_WAIT, CLOSING or TIME_WAIT states to be restored as well.

v2: We decided not to use a control message for repairing FIN packets in
queues, because it looks quite tricky. Alexey suggested restoring each
state separately, and in this case setsockopt looks more logical.

Andrey Vagin (2):
tcp: allow to enable repair mode for sockets in closing states
tcp: add ability to restore closing states (v2)

include/uapi/linux/tcp.h | 1 +
net/ipv4/tcp.c | 75 ++++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 71 insertions(+), 5 deletions(-)

Cc: "David S. Miller" <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrey Vagin <[email protected]>


2014-04-10 20:27:35

by Andrei Vagin

Subject: [PATCH 1/2] tcp: allow to enable repair mode for sockets in closing states

The repair mode is used for dumping the state of tcp connections
(sequence numbers, queues, options, etc).

Currently the repair mode can be enabled only for sockets in the
TCP_CLOSE or TCP_ESTABLISHED state. If a socket is in another state,
its internal state can not be dumped.

At the same time, there is no guarantee that a connection won't be in
another state when we are dumping it. To be able to dump and restore
such states, we need to get rid of the in-kernel CLOSE/ESTABLISHED
limitation.

I see nothing wrong with allowing the repair mode to be enabled for
connected sockets in any state.
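
For illustration only, the change to the state check can be modelled in
userspace as below. This is a sketch, not kernel code: the ST_* names, the
STF() macro and both function names are made up for this example, the
capability check is left out, and only the state numbers are taken from
include/net/tcp_states.h.

```c
#include <assert.h>
#include <stdbool.h>

/* TCP state numbers, mirroring include/net/tcp_states.h.
 * The ST_ prefix is hypothetical, used to avoid clashing with libc. */
enum {
	ST_ESTABLISHED = 1,
	ST_SYN_SENT,
	ST_SYN_RECV,
	ST_FIN_WAIT1,
	ST_FIN_WAIT2,
	ST_TIME_WAIT,
	ST_CLOSE,
	ST_CLOSE_WAIT,
	ST_LAST_ACK,
	ST_LISTEN,
	ST_CLOSING,
	ST_MAX_STATES
};

/* One bit per state, like the kernel's TCPF_* flags. */
#define STF(state) (1u << (state))

/* Model of the check before this patch: only CLOSE and ESTABLISHED pass. */
bool can_repair_old(int state)
{
	assert(state > 0 && state < ST_MAX_STATES);
	return (STF(state) & (STF(ST_CLOSE) | STF(ST_ESTABLISHED))) != 0;
}

/* Model of the check after this patch: everything passes except
 * LISTEN, SYN_SENT and SYN_RECV. */
bool can_repair_new(int state)
{
	assert(state > 0 && state < ST_MAX_STATES);
	return (STF(state) & (STF(ST_LISTEN) | STF(ST_SYN_SENT) |
			      STF(ST_SYN_RECV))) == 0;
}
```

With this model, can_repair_old(ST_FIN_WAIT1) is false while
can_repair_new(ST_FIN_WAIT1) is true, and ST_LISTEN is rejected by both.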

Cc: "David S. Miller" <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrey Vagin <[email protected]>
---
net/ipv4/tcp.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 2c7e326..bcb1d59 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1106,15 +1106,18 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	}

 	if (unlikely(tp->repair)) {
+		err = -EINVAL;
+		if (tp->repair_queue == TCP_NO_QUEUE)
+			goto out_err;
+
+		if (sk->sk_state != TCP_ESTABLISHED)
+			goto out_err;
+
 		if (tp->repair_queue == TCP_RECV_QUEUE) {
 			copied = tcp_send_rcvq(sk, msg, size);
 			goto out;
 		}

-		err = -EINVAL;
-		if (tp->repair_queue == TCP_NO_QUEUE)
-			goto out_err;
-
 		/* 'common' sending to sendq */
 	}

@@ -2375,7 +2378,8 @@ void tcp_sock_destruct(struct sock *sk)
 static inline bool tcp_can_repair_sock(const struct sock *sk)
 {
 	return ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN) &&
-		((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_ESTABLISHED));
+		!((1 << sk->sk_state) & (TCPF_LISTEN |
+					 TCPF_SYN_SENT | TCPF_SYN_RECV));
 }

static int tcp_repair_options_est(struct tcp_sock *tp,
--
1.9.0

2014-04-10 20:27:33

by Andrei Vagin

Subject: [PATCH 2/2] tcp: add ability to restore closing states (v2)

This patch adds the TCP_REPAIR_STATE option, which allows setting a
socket state. The socket must be in the repair mode and in the
TCP_ESTABLISHED state.

The states that can be set are TCP_FIN_WAIT{1,2}, TCP_CLOSE_WAIT,
TCP_CLOSING, TCP_LAST_ACK and TCP_TIME_WAIT.
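
As a rough sketch of how a restorer might use the new option from
userspace: the helper name repair_set_state() and the REPAIR_ST_FIN_WAIT2
macro are hypothetical, and TCP_REPAIR_STATE is defined by hand with the
value proposed in this RFC, since released headers do not carry it. On a
kernel without this patch the call is expected to fail.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Assumption: option number proposed by this RFC, not a released uapi. */
#ifndef TCP_REPAIR_STATE
#define TCP_REPAIR_STATE 26
#endif

/* Kernel state number for FIN_WAIT2 (include/net/tcp_states.h).
 * The macro name is hypothetical. */
#define REPAIR_ST_FIN_WAIT2 5

/*
 * Ask the kernel to move a socket into the given TCP state.
 * The socket is expected to already be in repair mode and in the
 * TCP_ESTABLISHED state. Returns 0 on success or a negative errno.
 */
int repair_set_state(int fd, int state)
{
	assert(state > 0); /* kernel TCP state numbers start at 1 */

	if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_STATE,
		       &state, sizeof(state)) < 0)
		return -errno;

	return 0;
}
```

A restorer would call repair_set_state(fd, REPAIR_ST_FIN_WAIT2) after the
queues and sequence numbers have been restored and before leaving the
repair mode.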

v2: We decided not to use a control message for repairing FIN packets in
queues, because it looks quite tricky. Alexey suggested restoring each
state separately, and in this case setsockopt looks more logical.

Cc: "David S. Miller" <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrey Vagin <[email protected]>
---
include/uapi/linux/tcp.h | 1 +
net/ipv4/tcp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 62 insertions(+)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 3b97183..6009062 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -112,6 +112,7 @@ enum {
#define TCP_FASTOPEN 23 /* Enable FastOpen on listeners */
#define TCP_TIMESTAMP 24
#define TCP_NOTSENT_LOWAT 25 /* limit number of unsent bytes in write queue */
+#define TCP_REPAIR_STATE 26 /* Current state of this connection */

struct tcp_repair_opt {
__u32 opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bcb1d59..9ded8e8 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2431,6 +2431,60 @@ static int tcp_repair_options_est(struct tcp_sock *tp,
 	return 0;
 }

+static int tcp_repair_state(struct sock *sk, int state)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (sk->sk_state != TCP_ESTABLISHED)
+		return -EINVAL;
+
+	switch (state) {
+	case TCP_ESTABLISHED:
+		break;
+
+	case TCP_FIN_WAIT2:
+		if (tp->snd_una != tp->write_seq)
+			return -EINVAL;
+		tcp_set_state(sk, TCP_FIN_WAIT2);
+		break;
+
+	case TCP_TIME_WAIT:
+		if (tp->snd_una != tp->write_seq)
+			return -EINVAL;
+		local_bh_disable();
+		tcp_time_wait(sk, TCP_TIME_WAIT, 0);
+		local_bh_enable();
+		break;
+
+	case TCP_CLOSE_WAIT:
+		tcp_set_state(sk, TCP_CLOSE_WAIT);
+		break;
+
+	case TCP_LAST_ACK:
+	case TCP_FIN_WAIT1:
+	case TCP_CLOSING:
+		tcp_set_state(sk, state);
+		tcp_send_fin(sk);
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	if ((1 << sk->sk_state) & (TCPF_FIN_WAIT1 |
+				   TCPF_FIN_WAIT2 |
+				   TCPF_CLOSING |
+				   TCPF_LAST_ACK))
+		sk->sk_shutdown |= SEND_SHUTDOWN;
+
+	if ((1 << sk->sk_state) & (TCPF_CLOSE_WAIT |
+				   TCPF_CLOSING |
+				   TCPF_LAST_ACK))
+		sk->sk_shutdown |= RCV_SHUTDOWN;
+
+	return 0;
+}
+
 /*
  * Socket option code for TCP.
  */
@@ -2568,6 +2622,13 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 			err = -EPERM;
 		break;

+	case TCP_REPAIR_STATE:
+		if (tp->repair)
+			err = tcp_repair_state(sk, val);
+		else
+			err = -EINVAL;
+		break;
+
 	case TCP_CORK:
 		/* When set indicates to always queue non-full frames.
 		 * Later the user clears this option and we transmit
--
1.9.0