Received: by 2002:a25:683:0:0:0:0:0 with SMTP id 125csp522837ybg; Wed, 3 Jun 2020 06:59:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyZQQ5g5kLT09kcIIkaVtCIZYZymEvoG/W+WBEmzhqg/Vda7kmlAEs738gQ7LQUod7mgAxg X-Received: by 2002:a17:906:9381:: with SMTP id l1mr28990287ejx.380.1591192741776; Wed, 03 Jun 2020 06:59:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1591192741; cv=none; d=google.com; s=arc-20160816; b=y5jRTjM0n7zjIXGUsasS/CumacsF1EiNa0hkvzpks0vSd120+DeOv/0S3Q4VbUvdHN V3kUPR8bo+HveqnMtDXLHADpVzu8d3NTRTx4u1X/cGAnS45uPsNyCEMpbndBvWNqju4y EFl1RSmydTvlDpFRQ6fSm6gnIJQN7Uqjj2V2qZh7E0wyVssU/cyly3kitKUuDtz/CtFI UfTAzty2GRZ4I9lfxl/LJKXvnxdtMt3m+jW9SYzv6aDBtQMB8RAPoMCoEqCxr8ftkvnB 3Rs8hZwn+BqR+kuGRlFos0lvaig80Um6Go6A5pUIEs3OouqrN0GwUNq1WBYatQF2oz4E aCZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=U8IeDiDoBVmA0QE3oVgedl+zceUqh62Dw2hpb3UyemU=; b=tMXxE3mT5GvaKL4W4IkxKkm8ZKpEM2R6ebtHWOTSDxuftLNK1RkFHw9pXc7jogvLwN RKxCbFerwU/UPrtYNO1Odnlq33qwFeC1S1I/5L6Vn5MW89Q+EAlsdu0dmPgPlBuIXlAY 1xhNi5r4o0VF6BzQ1ffHcGYfqDjcFukEcDssP03cm6DLAlBooximYh/EYBY7E4eTs7NE U30ige93CRbFtUcie8Dd3jjx0C0wmlYut9VxiEMaeCtDhqcYBc4dfy6EWLQdsitMb3w5 LHIj3ndBJDON5iUku4MfpSGYICRXDzZKNHZcNsrHTmylBjDJMIGXFN52Jckp3ID12Z3v OY7w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="oqrUFNi/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id lh17si1117821ejb.672.2020.06.03.06.58.36; Wed, 03 Jun 2020 06:59:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="oqrUFNi/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725995AbgFCNzK (ORCPT + 99 others); Wed, 3 Jun 2020 09:55:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725916AbgFCNzJ (ORCPT ); Wed, 3 Jun 2020 09:55:09 -0400 Received: from mail-yb1-xb41.google.com (mail-yb1-xb41.google.com [IPv6:2607:f8b0:4864:20::b41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AA80C08C5C1 for ; Wed, 3 Jun 2020 06:55:09 -0700 (PDT) Received: by mail-yb1-xb41.google.com with SMTP id u17so1121860ybi.0 for ; Wed, 03 Jun 2020 06:55:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=U8IeDiDoBVmA0QE3oVgedl+zceUqh62Dw2hpb3UyemU=; b=oqrUFNi/IKLm8zr/JkoLEWltOameVg+otBORTPeyDT6K+1vL0RDgZ8MIs6JKhZNwiR z/D8d1I15Y7tEVTunSpxsi7kWcXtbI6Vl8jBO4BTixPuDUkfVBiKGGyQdiOFGnJ+GZfU Q36XNbxDsRzFvH26Hf9zP5Lwb7NrNCsohYHmmBUT+WSCWfhXaroWcs/tDjEn77mte5c4 AKhellf9UepiS6Kq2dFe0MMyIUi2jOHBgg3EKbXsOzIhPWfBN8iqEM3L4YPODQ0pU/OR WX7gbyuqBzgHs6Qv0X45wC3efKl40NxhMzMBjvy6bCjibf3VOZWvtCogXFAevrgRU84o +UtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=U8IeDiDoBVmA0QE3oVgedl+zceUqh62Dw2hpb3UyemU=; b=uBwMWAzpZVdHR0cmb8h+dZjVUAvI6ENJNrTLMWnqNMCt2iUszcqqPAWovIzMDVGvSc m1oOAB/TZFRtlAlfs2tBglL+rUU+hWYCq+qq6iBWJYe9UiWrVT2saW5VHcXvMmnu4gsg mJbggXpzbEFq0io1GjKNrZ6//IkpfbUM208RP9S+nrw4BhMYSAcDupCL+LGv8a46Zee9 uvyHVJPd+w2/iNurwNayKjJXze7W8CrfcKahX/IIg0o/Ki57n/s7QPD8/Rl6Dg5/f+01 OP4pVIbbHyEpixjl3sWJiAY/FRlqx23lVyBUB4dZPkrriC1swtwZ3SmtLSQMsQNI48z0 txlA== X-Gm-Message-State: AOAM530e5SZLn3vK5BeH72ByS7dKPP1Wxo0GuN/XyV69iUMZkSCvJEcF R7bLl2/D2n+ITJPC8rAEeBEImEE41w/HFdVCxIryGg== X-Received: by 2002:a25:9a49:: with SMTP id r9mr1089ybo.520.1591192508483; Wed, 03 Jun 2020 06:55:08 -0700 (PDT) MIME-Version: 1.0 References: <20200602080425.93712-1-kerneljasonxing@gmail.com> In-Reply-To: From: Eric Dumazet Date: Wed, 3 Jun 2020 06:54:56 -0700 Message-ID: Subject: Re: [PATCH] tcp: fix TCP socks unreleased in BBR mode To: Neal Cardwell Cc: Jason Xing , David Miller , Alexey Kuznetsov , Hideaki YOSHIFUJI , netdev , LKML , liweishi@kuaishou.com, Shujin Li Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 3, 2020 at 5:02 AM Neal Cardwell wrote: > > On Wed, Jun 3, 2020 at 1:44 AM Eric Dumazet wrote: > > > > On Tue, Jun 2, 2020 at 10:05 PM Jason Xing wrote: > > > > > > Hi Eric, > > > > > > I'm still trying to understand what you're saying before. Would this > > > be better as following: > > > 1) discard the tcp_internal_pacing() function. > > > 2) remove where the tcp_internal_pacing() is called in the > > > __tcp_transmit_skb() function. > > > > > > If we do so, we could avoid 'too late to give up pacing'. Meanwhile, > > > should we introduce the tcp_wstamp_ns socket field as commit > > > (864e5c090749) does? > > > > > > > Please do not top-post on netdev mailing list. > > > > > > I basically suggested double-checking which point in TCP could end up > > calling tcp_internal_pacing() > > while the timer was already armed. > > > > I guess this is mtu probing. > > Perhaps this could also happen from some of the retransmission code > paths that don't use tcp_xmit_retransmit_queue()? Perhaps > tcp_retransmit_timer() (RTO) and tcp_send_loss_probe() TLP? It seems > they could indirectly cause a call to __tcp_transmit_skb() and thus > tcp_internal_pacing() without first checking if the pacing timer was > already armed? I feared this, (see recent commits about very low pacing rates) :/ I am not sure we need to properly fix all these points for old kernels, since EDT model got rid of these problems. Maybe we can try to extend the timer. Something like : diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index cc4ba42052c21b206850594db6751810d8fc72b4..626b9f4f500f7e5270d8d59e6eb16dbfa3efbc7c 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -966,6 +966,8 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer) static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb) { + struct tcp_sock *tp = tcp_sk(sk); + ktime_t expire, now; u64 len_ns; u32 rate; @@ -977,12 +979,29 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb) len_ns = (u64)skb->len * NSEC_PER_SEC; do_div(len_ns, rate); - hrtimer_start(&tcp_sk(sk)->pacing_timer, - ktime_add_ns(ktime_get(), len_ns), + + now = ktime_get(); + /* If hrtimer is already armed, then our caller has not + * used tcp_pacing_check(). + */ + if (unlikely(hrtimer_is_queued(&tp->pacing_timer))) { + expire = hrtimer_get_softexpires(&tp->pacing_timer); + if (ktime_after(expire, now)) + now = expire; + if (hrtimer_try_to_cancel(&tp->pacing_timer) == 1) + __sock_put(sk); + } + hrtimer_start(&tp->pacing_timer, ktime_add_ns(now, len_ns), HRTIMER_MODE_ABS_PINNED_SOFT); sock_hold(sk); } +static bool tcp_pacing_check(const struct sock *sk) +{ + return tcp_needs_internal_pacing(sk) && + hrtimer_is_queued(&tcp_sk(sk)->pacing_timer); +} + static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb) { skb->skb_mstamp = tp->tcp_mstamp; @@ -2117,6 +2136,9 @@ static int tcp_mtu_probe(struct sock *sk) if (!tcp_can_coalesce_send_queue_head(sk, probe_size)) return -1; + if (tcp_pacing_check(sk)) + return -1; + /* We're allowed to probe. Build it now. */ nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false); if (!nskb) @@ -2190,11 +2212,6 @@ static int tcp_mtu_probe(struct sock *sk) return -1; } -static bool tcp_pacing_check(const struct sock *sk) -{ - return tcp_needs_internal_pacing(sk) && - hrtimer_is_queued(&tcp_sk(sk)->pacing_timer); -} /* TCP Small Queues : * Control number of packets in qdisc/devices to two packets / or ~1 ms. > > neal