Received: by 2002:a05:6358:1087:b0:cb:c9d3:cd90 with SMTP id j7csp571970rwi; Thu, 27 Oct 2022 05:12:24 -0700 (PDT) X-Google-Smtp-Source: AMsMyM45F44DAne7BHs/bQtswz1oBDd2izflxIzparG1j9A9CqsLae+39drJiFkygR0uUCE24hpF X-Received: by 2002:a17:907:2c5b:b0:78d:3f8a:19d0 with SMTP id hf27-20020a1709072c5b00b0078d3f8a19d0mr40936959ejc.369.1666872733654; Thu, 27 Oct 2022 05:12:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666872733; cv=none; d=google.com; s=arc-20160816; b=tRUgVBgICI2jBIr+Rr1exyAUGJZYn+mJfLy1nEiojuQOsQyXmWW13o7Yfg2QH60dQy bRXxPf/zSDvmfQYui4H4MTe5JPPJqcECYCLCDez/vvJ9OQhocxBbmN2ulg1+WSF370tE lSSqRAATAmvJqDROw2iM18iEh9ydtCqxGnZm0H9lu6fgabL8HogFAKO9tYM8b+Km15x0 jfqD6+tYW3GwVy6FAGFXGy1UZZP7ne+odtHD9Ju/FoOErqqfpAeGuAeL/GSW9ULc1RqG LtfFtqWVonsqNcRRuZo9V2Sca2ycF9OZZZLb4rkldPjt6d1PYUzqVCu2Lzxvkp2mgPfl Kegg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=PTq4l/ef78moYrpScZ21ig6YKsWSFlDu+eO/n9004eE=; b=bW5HK9HPfNX9I0Q8Q9VltMs7NOGS3+C2SjXw3gJg7v7X0sT1qQgomB508Y4U6C0bMg PbVcBNHW+WbPEaFnQoOQ3CRs7y9G+YdoqB7UwXS9WazeAqeWHD1XqdgKu+QLAYHrD9Z8 OcgcElY8icMNQd+cp62WZYLBUtr90QVltvHXstw6jhULdoNO9j4E+4NKuTaD8SbVI8lr Zfp1C9sw2hPttMuy8UVosILt+IkeAPHi8uKuEmxcpJDJ9WRSmxGUKLVAi9XWean2u44j TM/9O67lMmV8iV9O/Eq9zuWN7rmSVgAt5SsFDINIV8LK6pmq3gJ/52nqIu78h85BvNsW opQg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=NEZG0rdC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qk17-20020a1709077f9100b007a0d3365aa9si1420140ejc.136.2022.10.27.05.11.42; Thu, 27 Oct 2022 05:12:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=NEZG0rdC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234596AbiJ0LSo (ORCPT + 99 others); Thu, 27 Oct 2022 07:18:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233956AbiJ0LSm (ORCPT ); Thu, 27 Oct 2022 07:18:42 -0400 Received: from mail-yb1-xb34.google.com (mail-yb1-xb34.google.com [IPv6:2607:f8b0:4864:20::b34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFF77F4184 for ; Thu, 27 Oct 2022 04:18:41 -0700 (PDT) Received: by mail-yb1-xb34.google.com with SMTP id z192so1538038yba.0 for ; Thu, 27 Oct 2022 04:18:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=PTq4l/ef78moYrpScZ21ig6YKsWSFlDu+eO/n9004eE=; b=NEZG0rdCo1EVP07wN70uIzUx69ba88S95t7lJ0Rp01UbhhXdBEKjWyyWo+mcSX2F5d wRSyJPZ0glZA5jZb33SgKnwBKh+LbLRpO3PmIGMnOpA1xNYoABaVDzi8hx6ADZnNYDpt aQQh+pRpGDuBjw52VLk8SN4UXTQ5/r8Rn2psVi677BH70ZFWyjUjGA6YfL6TTGzNMheT 8xzA600c0y8Ky75zZm/hKgJRo9NcgCFKCa+rypppHglMiRZ9ZFQAdMbM0B9peoejIUoQ WpE9V/g8/bdX95GU6UnA+KhPfN/4hHqU0nrtv5H0d72tYUkpkJmBSwoJM0WPdS7lAc6h 9uQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PTq4l/ef78moYrpScZ21ig6YKsWSFlDu+eO/n9004eE=; b=Sz4MU9paXklHgEVn/7/4Pnd0kdocnOn3XYMc/x206NW9HR1UjA90HK6WfVjab54cwy sg+3j/FA9GFvenGTJXXcEq9x8MPggoLU/+6Ro5IsLGCI+8dV9soBzV6wl9w+RB6YQ50q 9NhvxVJ2bTQSgg7e37oL0zWuvSNz2EYkhPPclFhUWNkFL1zYtJ/la/JAYXBdGuhuQqbs iG1KfMezk39u3ZJdgQtnD+LB51tuJLAHN+3dkPtu4ktMpOtP3riPZauaC+gapo2v9iRP +SkF18rz9CnkBn2YvtI0yzkG49V6dtU0HZCvmkSQH7cot0eCWLnMzwjV6Qpt3ZpIPDr1 4EKA== X-Gm-Message-State: ACrzQf1BN3IGyzl1GOKVi5UnnOF5vRUBY3N4r5tG7XqNrrElD3lrAHDy bGip7Z/Dk4fik/jPP8DVjLyWnu4yoGr92NQHzDT1vw== X-Received: by 2002:a25:32ca:0:b0:6ca:40e2:90c8 with SMTP id y193-20020a2532ca000000b006ca40e290c8mr33970276yby.55.1666869520904; Thu, 27 Oct 2022 04:18:40 -0700 (PDT) MIME-Version: 1.0 References: <20221027114930.137735-1-luwei32@huawei.com> In-Reply-To: From: Eric Dumazet Date: Thu, 27 Oct 2022 04:18:30 -0700 Message-ID: Subject: Re: [PATCH net v2] tcp: reset tp->sacked_out when sack is enabled To: Pavel Emelyanov Cc: Lu Wei , davem@davemloft.net, yoshfuji@linux-ipv6.org, dsahern@kernel.org, kuba@kernel.org, pabeni@redhat.com, Pavel Emelyanov , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Pavel Tikhomirov , den@virtuozzo.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Oct 27, 2022 at 4:14 AM Pavel Emelyanov wrote: > > > > =D1=87=D1=82, 27 =D0=BE=D0=BA=D1=82. 2022 =D0=B3., 14:08 Eric Dumazet : >> >> On Thu, Oct 27, 2022 at 3:45 AM Lu Wei wrote: >> > >> > If setsockopt with option name of TCP_REPAIR_OPTIONS and opt_code >> > of TCPOPT_SACK_PERM is called to enable sack after data is sent >> > and before data is acked, it will trigger a warning in function >> > tcp_verify_left_out() as follows: >> > >> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> > WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132 >> > tcp_timeout_mark_lost+0x154/0x160 >> > tcp_enter_loss+0x2b/0x290 >> > tcp_retransmit_timer+0x50b/0x640 >> > tcp_write_timer_handler+0x1c8/0x340 >> > tcp_write_timer+0xe5/0x140 >> > call_timer_fn+0x3a/0x1b0 >> > __run_timers.part.0+0x1bf/0x2d0 >> > run_timer_softirq+0x43/0xb0 >> > __do_softirq+0xfd/0x373 >> > __irq_exit_rcu+0xf6/0x140 >> > >> > This warning occurs in several steps: >> > Step1. If sack is not enabled, when server receives dup-ack, >> > it calls tcp_add_reno_sack() to increase tp->sacked_out. >> > >> > Step2. Setsockopt() is called to enable sack >> > >> > Step3. The retransmit timer expires, it calls tcp_timeout_mark_lost() >> > to increase tp->lost_out but not clear tp->sacked_out because >> > sack is enabled and tcp_is_reno() is false. >> > >> > So tp->left_out is increased repeatly in Step1 and Step3 and it is >> > greater than tp->packets_out and trigger the warning. In function >> > tcp_timeout_mark_lost(), tp->sacked_out will be cleared if Step2 not >> > happen and the warning will not be triggered. As suggested by Denis >> > and Eric, TCP_REPAIR_OPTIONS should be prohibited if data was already >> > sent. >> > >> > socket-tcp tests in CRIU has been tested as follows: >> > $ sudo ./test/zdtm.py run -t zdtm/static/socket-tcp* --keep-going \ >> > --ignore-taint >> > >> > socket-tcp* represent all socket-tcp tests in test/zdtm/static/. >> > >> > Fixes: b139ba4e90dc ("tcp: Repair connection-time negotiated parameter= s") >> > Signed-off-by: Lu Wei >> > --- >> > net/ipv4/tcp.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c >> > index ef14efa1fb70..ef876e70f7fe 100644 >> > --- a/net/ipv4/tcp.c >> > +++ b/net/ipv4/tcp.c >> > @@ -3647,7 +3647,7 @@ int do_tcp_setsockopt(struct sock *sk, int level= , int optname, >> > case TCP_REPAIR_OPTIONS: >> > if (!tp->repair) >> > err =3D -EINVAL; >> > - else if (sk->sk_state =3D=3D TCP_ESTABLISHED) >> > + else if (sk->sk_state =3D=3D TCP_ESTABLISHED && !tp->p= ackets_out) >> >> You keep focusing on packets_out :/ >> >> What I said was : TCP_REPAIR_OPTIONS must be denied if any packets >> have been sent (and possibly already ACK) > > > > If repair mode is ON, why does socket send any packets? This shouldn't ha= ppen from my perspective. Exactly. TCP_REPAIR is easily abused by fuzzers. We need to enforce sanity rules in the kernel, not assuming user space is following the CRIU way of doing things. > > -- Pavel > >> >> Looking at tp->packets_out alone is not sufficient. >> >> > err =3D tcp_repair_options_est(sk, optval, opt= len); >> > else >> > err =3D -EPERM; >> > -- >> > 2.31.1 >> >