2023-08-07 14:24:37

by Menglong Dong

[permalink] [raw]
Subject: [PATCH net-next v2 0/3]net: tcp: support probing OOM

From: Menglong Dong <[email protected]>

In this series, we make some small changes to make the tcp retransmission
become zero-window probes if the receiver drops the skb because of memory
pressure.

In the 1st patch, we reply a zero-window ACK if the skb is dropped
because out of memory, instead of dropping the skb silently.

In the 2nd patch, we allow a zero-window ACK to update the window.

In the 3rd patch, we check the timeout of a probing socket with
'(u32)icsk->icsk_timeout', instead of 'tcp_jiffies32'. This is more like
a bugfix.

After these changes, the tcp can probe the OOM of the receiver forever.

Changes since v1:
- send 0 rwin ACK for the receive queue empty case when necessary in the
1st patch
- send the ACK immediately by using the ICSK_ACK_NOW flag in the 1st
patch
- consider the case of the connection restart from idle, as Neal comment,
in the 3rd patch

Menglong Dong (3):
net: tcp: send zero-window ACK when no memory
net: tcp: allow zero-window ACK update the window
net: tcp: fix unexcepted socket die when snd_wnd is 0

include/net/inet_connection_sock.h | 3 ++-
net/ipv4/tcp_input.c | 16 ++++++++++++----
net/ipv4/tcp_output.c | 14 +++++++++++---
net/ipv4/tcp_timer.c | 10 +++++++++-
4 files changed, 34 insertions(+), 9 deletions(-)

--
2.40.1



2023-08-07 15:00:00

by Menglong Dong

[permalink] [raw]
Subject: [PATCH net-next v2 1/3] net: tcp: send zero-window ACK when no memory

From: Menglong Dong <[email protected]>

For now, skb will be dropped when no memory, which makes client keep
retrans util timeout and it's not friendly to the users.

In this patch, we reply an ACK with zero-window in this case to update
the snd_wnd of the sender to 0. Therefore, the sender won't timeout the
connection and will probe the zero-window with the retransmits.

Signed-off-by: Menglong Dong <[email protected]>
---
v2:
- send 0 rwin ACK for the receive queue empty case when necessary
- send the ACK immediately by using the ICSK_ACK_NOW flag
---
include/net/inet_connection_sock.h | 3 ++-
net/ipv4/tcp_input.c | 14 +++++++++++---
net/ipv4/tcp_output.c | 14 +++++++++++---
3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c2b15f7e5516..be3c858a2ebb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -164,7 +164,8 @@ enum inet_csk_ack_state_t {
ICSK_ACK_TIMER = 2,
ICSK_ACK_PUSHED = 4,
ICSK_ACK_PUSHED2 = 8,
- ICSK_ACK_NOW = 16 /* Send the next ACK immediately (once) */
+ ICSK_ACK_NOW = 16, /* Send the next ACK immediately (once) */
+ ICSK_ACK_NOMEM = 32,
};

void inet_csk_init_xmit_timers(struct sock *sk,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8e96ebe373d7..aae485d0a3b6 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5059,12 +5059,20 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)

/* Ok. In sequence. In window. */
queue_and_out:
- if (skb_queue_len(&sk->sk_receive_queue) == 0)
- sk_forced_mem_schedule(sk, skb->truesize);
- else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+ if (skb_queue_len(&sk->sk_receive_queue) == 0) {
+ if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+ sk_forced_mem_schedule(sk, skb->truesize);
+ inet_csk(sk)->icsk_ack.pending |=
+ (ICSK_ACK_NOMEM | ICSK_ACK_NOW);
+ inet_csk_schedule_ack(sk);
+ }
+ } else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
reason = SKB_DROP_REASON_PROTO_MEM;
NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
sk->sk_data_ready(sk);
+ inet_csk(sk)->icsk_ack.pending |=
+ (ICSK_ACK_NOMEM | ICSK_ACK_NOW);
+ inet_csk_schedule_ack(sk);
goto drop;
}

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c5412ee77fc8..769a558159ee 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -257,11 +257,19 @@ EXPORT_SYMBOL(tcp_select_initial_window);
static u16 tcp_select_window(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
- u32 old_win = tp->rcv_wnd;
- u32 cur_win = tcp_receive_window(tp);
- u32 new_win = __tcp_select_window(sk);
struct net *net = sock_net(sk);
+ u32 old_win = tp->rcv_wnd;
+ u32 cur_win, new_win;
+
+ /* Make the window 0 if we failed to queue the data because we
+ * are out of memory. The window is temporary, so we don't store
+ * it on the socket.
+ */
+ if (unlikely(inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM))
+ return 0;

+ cur_win = tcp_receive_window(tp);
+ new_win = __tcp_select_window(sk);
if (new_win < cur_win) {
/* Danger Will Robinson!
* Don't update rcv_wup/rcv_wnd here or else
--
2.40.1


2023-08-07 15:22:10

by Menglong Dong

[permalink] [raw]
Subject: [PATCH net-next v2 2/3] net: tcp: allow zero-window ACK update the window

From: Menglong Dong <[email protected]>

Fow now, an ACK can update the window in following case, according to
the tcp_may_update_window():

1. the ACK acknowledged new data
2. the ACK has new data
3. the ACK expand the window and the seq of it is valid

Now, we allow the ACK update the window if the window is 0, and the
seq/ack of it is valid. This is for the case that the receiver replies
an zero-window ACK when it is under memory stress and can't queue the new
data.

Signed-off-by: Menglong Dong <[email protected]>
---
net/ipv4/tcp_input.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index aae485d0a3b6..f17cca362086 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3525,7 +3525,7 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
{
return after(ack, tp->snd_una) ||
after(ack_seq, tp->snd_wl1) ||
- (ack_seq == tp->snd_wl1 && nwin > tp->snd_wnd);
+ (ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin));
}

/* If we update tp->snd_una, also update tp->bytes_acked */
--
2.40.1