From: Kuniyuki Iwashima
Subject: Re: [PATCH v1 bpf-next 03/11] tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.
Date: Tue, 8 Dec 2020 16:42:30 +0900
Message-ID: <20201208074230.35109-1-kuniyu@amazon.co.jp>
In-Reply-To: <20201208065418.ne75jprdbpglrgal@kafai-mbp.dhcp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From:   Martin KaFai Lau
Date:   Mon, 7 Dec 2020 22:54:18 -0800
> On Tue, Dec 01, 2020 at 11:44:10PM +0900, Kuniyuki Iwashima wrote:
> >
> > @@ -242,8 +244,12 @@ void reuseport_detach_sock(struct sock *sk)
> >
> >  		reuse->num_socks--;
> >  		reuse->socks[i] = reuse->socks[reuse->num_socks];
> > +		prog = rcu_dereference(reuse->prog);
> >
> >  		if (sk->sk_protocol == IPPROTO_TCP) {
> > +			if (reuse->num_socks && !prog)
> > +				nsk = i == reuse->num_socks ? reuse->socks[i - 1] : reuse->socks[i];
> I asked in the earlier thread if the primary use case is to only
> use the bpf prog to pick.  That thread did not come to
> a solid answer but did conclude that the sysctl should not
> control the behavior of the BPF_SK_REUSEPORT_SELECT_OR_MIGRATE prog.
>
> From this change here, it seems it is still desired to only depend
> on the kernel to random pick even when no bpf prog is attached.

I wrote it this way only to split the patches into tcp and bpf parts.
So, in the 10th patch, the eBPF prog is run if the type is
BPF_SK_REUSEPORT_SELECT_OR_MIGRATE.

https://lore.kernel.org/netdev/20201201144418.35045-11-kuniyu@amazon.co.jp/

But this causes breakage, so I will move the
BPF_SK_REUSEPORT_SELECT_OR_MIGRATE validation into the 10th patch so that
the type is only available from the 10th patch on.

---8<---
	case BPF_PROG_TYPE_SK_REUSEPORT:
		switch (expected_attach_type) {
		case BPF_SK_REUSEPORT_SELECT:
		case BPF_SK_REUSEPORT_SELECT_OR_MIGRATE: <- move to 10th.
			return 0;
		default:
			return -EINVAL;
		}
---8<---


> If that is the case, a sysctl to guard here for not changing
> the current behavior makes sense.
> It should still only control the non-bpf-pick behavior:
> when the sysctl is on, the kernel will still do a random pick
> when there is no bpf prog attached to the reuseport group.
> Thoughts?

If different applications listen on the same port without an eBPF prog, I
think the sysctl is necessary. But honestly, I am not sure that such a case
really exists and that the sysctl is needed.

If a patchset with the sysctl is more acceptable, I will add it back in the
next spin.


> > +
> >  			reuse->num_closed_socks++;
> >  			reuse->socks[reuse->max_socks - reuse->num_closed_socks] = sk;
> >  		} else {
> > @@ -264,6 +270,8 @@ void reuseport_detach_sock(struct sock *sk)
> >  		call_rcu(&reuse->rcu, reuseport_free_rcu);
> >  out:
> >  	spin_unlock_bh(&reuseport_lock);
> > +
> > +	return nsk;
> >  }
> >  EXPORT_SYMBOL(reuseport_detach_sock);
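
For reference, the no-prog fallback in the quoted hunk can be modelled in
user space roughly as below. This is only a sketch under assumptions:
struct group, has_prog and detach_and_pick() are hypothetical names that do
not exist in the kernel, plain ints stand in for struct sock pointers, and
the locking, RCU and closed-socks bookkeeping are left out.

```c
#include <assert.h>

#define MAX_SOCKS 8

/* Hypothetical user-space stand-in for struct sock_reuseport. */
struct group {
	int socks[MAX_SOCKS];	/* listener ids; the kernel keeps struct sock * */
	unsigned int num_socks;	/* number of live listeners */
	int has_prog;		/* nonzero if a bpf prog is attached */
};

/*
 * Detach the listener at index i and return the id of the listener that
 * would take over its children, or -1 if no listener is picked (group is
 * now empty, or a prog is attached and should do the pick instead).
 */
static int detach_and_pick(struct group *g, unsigned int i)
{
	int nsk = -1;

	assert(i < g->num_socks);

	/* Same swap as the hunk: move the last listener into the hole. */
	g->num_socks--;
	g->socks[i] = g->socks[g->num_socks];

	/*
	 * If the detached listener was the last slot, socks[i] is now out of
	 * the live range, so fall back to the previous slot.
	 */
	if (g->num_socks && !g->has_prog)
		nsk = (i == g->num_socks) ? g->socks[i - 1] : g->socks[i];

	return nsk;
}
```

With three listeners {10, 11, 12}, detaching index 0 moves 12 into slot 0
and picks it. Detaching the last remaining slot picks its predecessor, and
detaching the only listener picks nothing.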