Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp37155081rwd; Tue, 11 Jul 2023 10:07:07 -0700 (PDT) X-Google-Smtp-Source: APBJJlFNWhlFEIhtOC4mZKhqxGEi0w18st/9DgI5ihbqvnZ6s/CVlSJjj38i5AsdVSu/FzkI38fJ X-Received: by 2002:a05:6358:93a3:b0:130:afe8:43de with SMTP id h35-20020a05635893a300b00130afe843demr15870953rwb.30.1689095226821; Tue, 11 Jul 2023 10:07:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689095226; cv=none; d=google.com; s=arc-20160816; b=gfzuxF/R0b6uXiBh9bb4lZOPvAB7F6YKN8bOSFdn61cvssYHpOH4WdO2f83efweY6l +avbfVSvIESiIGB1YGEDdScytnMowEiewNWyoXTE4ixie8GQdHJC0ok9diNa5GPgswFn YmJlz9nsChkBN4moOqJ2er2kBtFTrQipxAqZfaKPjybBgLIirxCfLQPLQdOgblMXAF3U QVbHcUsscylurHdy/u1PfDrhtoCNjljrI5RwLndTmqfPEr7OuMBm6j3RrewWByDbvwR8 pamiE+Ob3Q8+uZErCESI3R/zo0aVXmDFQ7vEAheo0YpdOjCIv48Mu7uAy4atOW2nyk65 R74g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=b0cPI8Zq0PeegUi44hkwFDNWJ9aJ3HVoZc/kLcQkpf4=; fh=2GLLOBOjD+NNLE0nosaWJJJxJRidBF6SGJPyyGDAfj4=; b=TEamBgDCP2ZhovvjZxNqktwargCrHPKOALxnU1UzuJdr3d4AeCvrzyj5Bw5GuxIl1Z pbUhvoSwdALvAeT3zNabrKeFACESkDsvLnjAson9RpFo+2U1pqekNfaD+zRlF8dQIym6 VXjTJlHuaYEvemvUuutNsQfo82xW45zHdUYc9sNh3UW9ftxtuQmJC5vYRlQNOMJSALTc 7FnSshNaSex1jiMP9P1pe0M7svw5lygtxvnPKY1N1B4uu6Ww7WcFb9KW01M/UhfwbhHO kDbwYkXBesbCDwESAsinDWJvPdemkWBt4DurhpXo097yLF0XDpwYawi7JiDi8/kvKRB4 d+Iw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@isovalent.com header.s=google header.b=UuD8b+Af; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=isovalent.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 7-20020a630907000000b0055386e1eb92si1631515pgj.422.2023.07.11.10.06.53; Tue, 11 Jul 2023 10:07:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@isovalent.com header.s=google header.b=UuD8b+Af; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=isovalent.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231533AbjGKQPd (ORCPT + 99 others); Tue, 11 Jul 2023 12:15:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231825AbjGKQPW (ORCPT ); Tue, 11 Jul 2023 12:15:22 -0400 Received: from mail-ej1-x62e.google.com (mail-ej1-x62e.google.com [IPv6:2a00:1450:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DAB041704 for ; Tue, 11 Jul 2023 09:15:18 -0700 (PDT) Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-99313a34b2dso686341566b.1 for ; Tue, 11 Jul 2023 09:15:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1689092117; x=1691684117; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=b0cPI8Zq0PeegUi44hkwFDNWJ9aJ3HVoZc/kLcQkpf4=; b=UuD8b+Afj9oDvXQEwsLdh6bNZbU5rx8jYmvqtU7VORiVL9R/oX7NzGZJhYCqpd7LRU RMZ4psdDy39BM26u+YM9UyVJUIITTEdeBE2HshuXlY/5/T7uCwIReB1KUeS0+OUTdO3f XJS6BuPYccXg3zKRC0EW8rvRtEyUPmRDMuuZ4FYziBxY3OqPYakIwzuSvT/lx1n+yr+0 AAuLAY2175VPPwnLh1YxdFYVZ1aqt83GSGkH33DBYkTMFSaj47/Ij13OsDIo2rh3q6K2 rM3xeYRk7MzMvAt33Kd+/8lYusal4Y05a0KOr4nfVX2fDwQFBhkaR7RoNFSyyEoZZxXF owKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689092117; x=1691684117; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=b0cPI8Zq0PeegUi44hkwFDNWJ9aJ3HVoZc/kLcQkpf4=; b=cri1/+zF+tBJzcjKJd/ZcR2QgWQZ+YbT6TqRxka0AT/SCYKMuDRgCPq34vnMiG5RqC QQFAFeh3bEkezMA8zyMKiGibHLRI26oKcQikvrj2pNvcSIps4d1rZC00VeJW/JYHuu/V 8Hw/BB0HO39miW9nFO68UPGsZ90H2/kgBBfeNQSm5LnqG6NITKNIsaIOuW2Gul6rkxs1 BLZ6TTcPRf5/EFs92cNlYjvXSQKpzXL3zCAzZdsoieVkY4v09SWl9ss8ifetNp4d+4Fq U+K2t2FN5uuCXtN+TGLNlsLuvij7CRW1yz9tjFAuuuVJB3fLJd5ll050zM64y37yZONS uLDQ== X-Gm-Message-State: ABy/qLaQccLWCXqbrRijJxBvTGcdh+IBKVbOZ+2VCAMrqcQiwQhIO/Le NPOPgNc+xkQyzY8umomxBQ/RWNxXJT5K9q0wkMZTWw== X-Received: by 2002:a17:906:21a:b0:994:1ef9:91dc with SMTP id 26-20020a170906021a00b009941ef991dcmr1792295ejd.15.1689092117275; Tue, 11 Jul 2023 09:15:17 -0700 (PDT) MIME-Version: 1.0 References: <20230613-so-reuseport-v5-0-f6686a0dbce0@isovalent.com> <20230613-so-reuseport-v5-6-f6686a0dbce0@isovalent.com> In-Reply-To: <20230613-so-reuseport-v5-6-f6686a0dbce0@isovalent.com> From: Lorenz Bauer Date: Tue, 11 Jul 2023 17:15:06 +0100 Message-ID: Subject: Re: [PATCH bpf-next v5 6/7] bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , David Ahern , Willem de Bruijn , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Joe Stringer , Mykola Lysenko , Shuah Khan , Kuniyuki Iwashima Cc: Hemanth Malla , netdev@vger.kernel.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Joe Stringer Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jul 4, 2023 at 2:46=E2=80=AFPM Lorenz Bauer wro= te: > > +static inline > +struct sock *inet6_steal_sock(struct net *net, struct sk_buff *skb, int = doff, > + const struct in6_addr *saddr, const __be16 = sport, > + const struct in6_addr *daddr, const __be16 = dport, > + bool *refcounted, inet6_ehashfn_t *ehashfn) > +{ > + struct sock *sk, *reuse_sk; > + bool prefetched; > + > + sk =3D skb_steal_sock(skb, refcounted, &prefetched); > + if (!sk) > + return NULL; > + > + if (!prefetched) > + return sk; > + > + if (sk->sk_protocol =3D=3D IPPROTO_TCP) { > + if (sk->sk_state !=3D TCP_LISTEN) > + return sk; > + } else if (sk->sk_protocol =3D=3D IPPROTO_UDP) { > + if (sk->sk_state !=3D TCP_CLOSE) > + return sk; > + } else { > + return sk; > + } > + > + reuse_sk =3D inet6_lookup_reuseport(net, sk, skb, doff, > + saddr, sport, daddr, ntohs(dpor= t), > + ehashfn); > + if (!reuse_sk) > + return sk; > + > + /* We've chosen a new reuseport sock which is never refcounted. T= his > + * implies that sk also isn't refcounted. > + */ > + WARN_ON_ONCE(*refcounted); > + > + return reuse_sk; > +} Hi Kuniyuki, Continuing the conversation from v5 of the patch set, you wrote: In inet6?_steal_sock(), we call inet6?_lookup_reuseport() only for sk that was a TCP listener or UDP non-connected socket until just before the sk_state checks. Then, we know *refcounted should be false for such sockets even before inet6?_lookup_reuseport(). This makes sense for me in the TCP listener case. I understand UDP less, so I'll have to rely on your input. I tried to convince myself that all UDP sockets in TCP_CLOSE have SOCK_RCU_FREE set. However, the only place I see sock_set_flag(sk, SOCK_RCU_FREE) in the UDP case is in udp_lib_get_port(). That in turn seems to be called during bind. So, what if BPF does bpf_sk_assign() of an unbound and unconnected socket? Wouldn't that trigger the warning? To maybe sidestep this question: do you think the location of the WARN_ON_ONCE has to prevent this patch set from going in? I've been noodling at it for quite a while already and it would be good to see it land. Thanks Lorenz