Received: by 2002:a25:683:0:0:0:0:0 with SMTP id 125csp1153307ybg; Thu, 4 Jun 2020 02:22:40 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzZfdMvBJB6+kWzfwfA8AX3rbWifXfxLEGH8OJ8Ezyn0bVfBhQ7Nspd0WsGydMBdtatnn9Z X-Received: by 2002:a17:906:5e05:: with SMTP id n5mr2991067eju.278.1591262560281; Thu, 04 Jun 2020 02:22:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1591262560; cv=none; d=google.com; s=arc-20160816; b=azNhh3gMs7BMVgE9jSyonG0po34t1ayB0eAqhKDhb6Yk+h/B7USQXcH/YUCoH0iOAa KGS7m6uxoEPAlwgTy7nPV2V33wLdbOj15YGIgB4jFQVphUm/RzjyPG2AA1xlt/3rRL2p I3zgozsZIEZZN5BgEJ8tA+cwvchF8xLI6gHr7rCKw20NnrLsD2ufqeFTr1fgq3FGDr7h eBr2+9xlKHsq7CNostf8t5ial8npMXlyNSeovzkaI8ZzkMQWBInruyJRW4zpT5vXe5zi BZkE5xrlH0gTZlwzK2lKTNdDxW+FDhdYHXN9lEEf1X+YWHTEB+ZifkMW/VBIJ6IAYdcK ySiQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=niXbj2ikmDqITdGXIco/ydF534thB0b9rEZsBYeN5ew=; b=IT5H1oiJwsZjkSuc4hqjSj1c/6i6GPQhjSFMptGVup4HmEtvVFYtaEPPIgP6SnHUWk Lv/DKblMao1NFXpZhREeGPJFoULe/R2nTqFfu5Hg8GDlTG1bRRqTAoBAc+3Y+51Y/28C LHlC212j88hdligOWbcUfTvsantDpiIv7cOF+n13+h3/l6poTCZ9EJet5KJV/C96U/2O 15/f/gnb2jGNMx22FV/Bd2KkTB1DNqtheDmYgOIPqdxCeDnwb7LfgM//rmjLT0lMGzDL 057jlYA8lPPZgN/dQa25w4PEzHTwpIabxkTUMLm+HCdCfuiUNuFLn8rMVGPBmAofoBMt znRg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="HByN/Qzr"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id gu10si1240810ejb.38.2020.06.04.02.22.17; Thu, 04 Jun 2020 02:22:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="HByN/Qzr"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727976AbgFDIlZ (ORCPT + 99 others); Thu, 4 Jun 2020 04:41:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55926 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726802AbgFDIlY (ORCPT ); Thu, 4 Jun 2020 04:41:24 -0400 Received: from mail-lj1-x244.google.com (mail-lj1-x244.google.com [IPv6:2a00:1450:4864:20::244]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23CC7C05BD1E; Thu, 4 Jun 2020 01:41:24 -0700 (PDT) Received: by mail-lj1-x244.google.com with SMTP id s1so6252904ljo.0; Thu, 04 Jun 2020 01:41:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=niXbj2ikmDqITdGXIco/ydF534thB0b9rEZsBYeN5ew=; b=HByN/Qzrccr9TUDTBqlH9sp5SnbAcfC6WAcWM6JKC4n811VDcHYkwm0/u3OmxAelV6 cZPj/gfkX9vU+Zc/qQEUepo0E2+b93MXuasEqWfIYcNap2BIxcVbpaOlGZaa0sBMZ9wN gGhVbz9x+Uz7vjjpB+F5jPdJefWJsEEXEia/Cb6Qumc7TDCesy65BXxZSZnHjjO1PxvA oxjRG1VCJ6YprOF5ivxlzOqAnklDDBPpUqcaoSLjwCqXv9SyIt2dWxguBKNdH/sWirDM HWbi89SSdxpt7mNbObd/2Z7XbapZOYzNqP1r8KHp4pggJXtOjIcIwDrtVzb0+Srgw8Nn AtFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=niXbj2ikmDqITdGXIco/ydF534thB0b9rEZsBYeN5ew=; b=IoAdJYzIMHRxi24J9krpNTGJ1RemSgWpqZe/sMKvJiy0lwrCuwkSqwH5qhm+zoL/Y6 egthvpCORK5JHq+YSLtfbQh6h7DB5tfyGXkYpvToZSZucVNFvEjnwNyULWpCEOEHOJ90 vNt4TH2KjXpvPlg30Kp3hIhYQcS7q38dQcsTz7Gg0UeyCufl56G+FNfoFpM9Yg1Wm9+5 Sk8mHnKDIaqxPa264EqObCyRyruXc76QI76EOtr8f7Vy9SI7BqtqdAXnNbSyRdJ0TBPN e7mqF0FYdqZmRwfjVnif8UGgLyViXQWqpMxqxaZEnPvGpoJIBM5aVz+bDeeUVvE2XLuy y5EQ== X-Gm-Message-State: AOAM532/0uhB1BVM2EdbpEj7g4RaKuu3VFb6lgH9r6sgckqc+wehaGeE VsEdvx9ubeyB5cMJMLBSE9Cowh4PVfeTdCnqkOo= X-Received: by 2002:a2e:8e64:: with SMTP id t4mr1692370ljk.414.1591260082504; Thu, 04 Jun 2020 01:41:22 -0700 (PDT) MIME-Version: 1.0 References: <20200602080425.93712-1-kerneljasonxing@gmail.com> In-Reply-To: From: Jason Xing Date: Thu, 4 Jun 2020 16:40:46 +0800 Message-ID: Subject: Re: [PATCH] tcp: fix TCP socks unreleased in BBR mode To: Neal Cardwell Cc: Eric Dumazet , David Miller , Alexey Kuznetsov , Hideaki YOSHIFUJI , netdev , LKML , liweishi , Shujin Li Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 3, 2020 at 10:08 PM Neal Cardwell wrote: > > On Wed, Jun 3, 2020 at 9:55 AM Eric Dumazet wrote: > > > > On Wed, Jun 3, 2020 at 5:02 AM Neal Cardwell wrote: > > > > > > On Wed, Jun 3, 2020 at 1:44 AM Eric Dumazet wrote: > > > > > > > > On Tue, Jun 2, 2020 at 10:05 PM Jason Xing wrote: > > > > > > > > > > Hi Eric, > > > > > > > > > > I'm still trying to understand what you're saying before. Would this > > > > > be better as following: > > > > > 1) discard the tcp_internal_pacing() function. > > > > > 2) remove where the tcp_internal_pacing() is called in the > > > > > __tcp_transmit_skb() function. > > > > > > > > > > If we do so, we could avoid 'too late to give up pacing'. Meanwhile, > > > > > should we introduce the tcp_wstamp_ns socket field as commit > > > > > (864e5c090749) does? > > > > > > > > > > > > > Please do not top-post on netdev mailing list. > > > > > > > > > > > > I basically suggested double-checking which point in TCP could end up > > > > calling tcp_internal_pacing() > > > > while the timer was already armed. > > > > > > > > I guess this is mtu probing. > > > > > > Perhaps this could also happen from some of the retransmission code > > > paths that don't use tcp_xmit_retransmit_queue()? Perhaps > > > tcp_retransmit_timer() (RTO) and tcp_send_loss_probe() TLP? It seems > > > they could indirectly cause a call to __tcp_transmit_skb() and thus > > > tcp_internal_pacing() without first checking if the pacing timer was > > > already armed? > > > > I feared this, (see recent commits about very low pacing rates) :/ > > > > I am not sure we need to properly fix all these points for old > > kernels, since EDT model got rid of these problems. > > Agreed. > > > Maybe we can try to extend the timer. > > Sounds good. > > > Something like : > > > > > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > > index cc4ba42052c21b206850594db6751810d8fc72b4..626b9f4f500f7e5270d8d59e6eb16dbfa3efbc7c > > 100644 > > --- a/net/ipv4/tcp_output.c > > +++ b/net/ipv4/tcp_output.c > > @@ -966,6 +966,8 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer) > > > > static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb) > > { > > + struct tcp_sock *tp = tcp_sk(sk); > > + ktime_t expire, now; > > u64 len_ns; > > u32 rate; > > > > @@ -977,12 +979,29 @@ static void tcp_internal_pacing(struct sock *sk, > > const struct sk_buff *skb) > > > > len_ns = (u64)skb->len * NSEC_PER_SEC; > > do_div(len_ns, rate); > > - hrtimer_start(&tcp_sk(sk)->pacing_timer, > > - ktime_add_ns(ktime_get(), len_ns), > > + > > + now = ktime_get(); > > + /* If hrtimer is already armed, then our caller has not > > + * used tcp_pacing_check(). > > + */ > > + if (unlikely(hrtimer_is_queued(&tp->pacing_timer))) { > > + expire = hrtimer_get_softexpires(&tp->pacing_timer); > > + if (ktime_after(expire, now)) > > + now = expire; > > + if (hrtimer_try_to_cancel(&tp->pacing_timer) == 1) > > + __sock_put(sk); > > + } > > + hrtimer_start(&tp->pacing_timer, ktime_add_ns(now, len_ns), > > HRTIMER_MODE_ABS_PINNED_SOFT); > > sock_hold(sk); > > } > > > > +static bool tcp_pacing_check(const struct sock *sk) > > +{ > > + return tcp_needs_internal_pacing(sk) && > > + hrtimer_is_queued(&tcp_sk(sk)->pacing_timer); > > +} > > + > > static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb) > > { > > skb->skb_mstamp = tp->tcp_mstamp; > > @@ -2117,6 +2136,9 @@ static int tcp_mtu_probe(struct sock *sk) > > if (!tcp_can_coalesce_send_queue_head(sk, probe_size)) > > return -1; > > > > + if (tcp_pacing_check(sk)) > > + return -1; > > + > > /* We're allowed to probe. Build it now. */ > > nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false); > > if (!nskb) > > @@ -2190,11 +2212,6 @@ static int tcp_mtu_probe(struct sock *sk) > > return -1; > > } > > > > -static bool tcp_pacing_check(const struct sock *sk) > > -{ > > - return tcp_needs_internal_pacing(sk) && > > - hrtimer_is_queued(&tcp_sk(sk)->pacing_timer); > > -} > > > > /* TCP Small Queues : > > * Control number of packets in qdisc/devices to two packets / or ~1 ms. > > Thanks for your fix, Eric. This fix looks good to me! I agree that > this fix is good enough for older kernels. > I just tested this patch and it worked well. So it also looks good to me :) Nice work! thanks, Jason > thanks, > neal