Received: by 2002:a05:6358:c692:b0:131:369:b2a3 with SMTP id fe18csp5080626rwb; Mon, 31 Jul 2023 18:09:32 -0700 (PDT) X-Google-Smtp-Source: APBJJlEQFWupsYlgDh9n2ART/gzl0cE+uZ49qbM8A3TjlZnE60i+EqcDQjuRLhjRY2+y91WHKv91 X-Received: by 2002:a17:906:8a77:b0:99b:4e70:d09b with SMTP id hy23-20020a1709068a7700b0099b4e70d09bmr1016072ejc.46.1690852172175; Mon, 31 Jul 2023 18:09:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690852172; cv=none; d=google.com; s=arc-20160816; b=g5iCTSpscahZHJnl5gfQgQSBZDm+x8NwfYWIRzKIeX7/iMEmY2LdHIHOS44bS0+lPy rlwfjwInn4Owuk9aPbq4gqirej+PNFm1akYAypwgjX3OhxSUaOpDXyvmIDwSSKpC6qm3 QrQ3sIWERDI2Tb/rnXytNTcRzrP3psLiMi0Kp5rV3JimkBr1TI6S+lvdCD+SeJvTyrRw hYd4/0NxAX56+EuROvKXaikiVEYMJiW4wZYxe6jUJ+imas51o3M6ci5NTCn3rX9B5B8L sAl2IGAVXPBQvZXDSVedhJgvM7Ey0LwLabMt5SCUxzTWelo2a4iize8ZrxnhxXxmQJSt NfLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=4WL3gOAOsh8o0LXKeeq1iHRytJLE+8+Td/rKsOuxt3c=; fh=GO0IyLN1GTX2QAA2ka6TV70Ize3sdrnke/p7wPLdrcE=; b=unByW+rlFw4fRIZN4UljcpWmEslitmwH/t28pgMtNNw3gufUx718c0mh3xXAss3wgb JKAaNLv5v0WUhQiXiBmVAVKSY3qfmODoePMK8w7lEZ5MmIRaY0vuQ0veUnWIqC5JysHa A+ivDM29Yau5twzGxAZvfWK3W6t4aQMjCfYnGfcYgqR7+k7jqrgDHI+FB8mN+0ljhw7v GzTJHDZDJ37uGFSrswXxBsSd5fy/4mDT3LdQc3sCM7xs6H7p639b4kxvr78Rjh6MxLYJ dEf2d9hNAZ4NK4z9C3qep7JSqQPPFRJfg0WJZKODwAgMyrTARbTNMQ5QOM+xp8kE2dKX XOBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cloudflare.com header.s=google header.b=ifhJ1b6g; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=cloudflare.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id rk12-20020a170907214c00b00993a68a3af5si7707640ejb.529.2023.07.31.18.09.08; Mon, 31 Jul 2023 18:09:32 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@cloudflare.com header.s=google header.b=ifhJ1b6g; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=cloudflare.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231477AbjGaXCH (ORCPT + 99 others); Mon, 31 Jul 2023 19:02:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230301AbjGaXCE (ORCPT ); Mon, 31 Jul 2023 19:02:04 -0400 Received: from mail-ed1-x52f.google.com (mail-ed1-x52f.google.com [IPv6:2a00:1450:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B2C31BEB for ; Mon, 31 Jul 2023 16:02:01 -0700 (PDT) Received: by mail-ed1-x52f.google.com with SMTP id 4fb4d7f45d1cf-5222b917e0cso7387128a12.0 for ; Mon, 31 Jul 2023 16:02:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; t=1690844519; x=1691449319; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=4WL3gOAOsh8o0LXKeeq1iHRytJLE+8+Td/rKsOuxt3c=; b=ifhJ1b6gnzdlfV3hSa51bPaBs3eS8DGeEYJ1X56dUdEAsywWSe+UBM5K8J0teZorOw ZCeoSk2Wpbudv9QJbyUSVS61XtzgkEZLhUMfJhWJ/kMLB46f6ozvmjwPE8zS0H2Q+aBI xVI7E+FKWFVFUFnmjndaHdAva+vDU9TxrTfD0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690844519; x=1691449319; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4WL3gOAOsh8o0LXKeeq1iHRytJLE+8+Td/rKsOuxt3c=; b=eVfM+eayA7baeEaUEU7zRZBX+cis/iRUakQjKE1WITHKL+cSsFx1g3KuaaQkMwLEa/ 5cRPzPUtrY2MU7vJHeW3D9Ta+bVAbelb+XTbUyMnsCykZy8gm9Du92ZZdJGR9eIrB8CZ NsNLsEHCj2hwBqBGi+M1Hf2JQDo3S3mk8UpWHoKOROR7zU3MIUaQaFCEJBAleBTncEFO nk7ymC3WnIc7NhcMCXQhqCcUOW5CXnTd/kWAfwCHv5YZ+yO07KNn+DVLO6oBXKOWS5kT PB3RauEnDghmJOFnIm2jx2jHXC38u3M0g/EQuIOyK0M5DnbEti478o7PVKUlYVXzS8qP PHNQ== X-Gm-Message-State: ABy/qLaIk3hl/yecK5JETCJfRmvJx+6WDHT1Oks4HCyrCiPMauahOtVN TC1kEV5iv+DGM7cSiu3QdTUgc/7QWWqZp9GtQbEz0w== X-Received: by 2002:a05:6402:14d1:b0:522:ba6c:9b1b with SMTP id f17-20020a05640214d100b00522ba6c9b1bmr880560edx.26.1690844519499; Mon, 31 Jul 2023 16:01:59 -0700 (PDT) MIME-Version: 1.0 References: <266ab56e-ae83-7ddc-618e-3af228df81bd@linux.dev> <2f285967-6cc0-c492-6a79-edc233c1368e@linux.dev> In-Reply-To: <2f285967-6cc0-c492-6a79-edc233c1368e@linux.dev> From: Yan Zhai Date: Mon, 31 Jul 2023 18:01:48 -0500 Message-ID: Subject: Re: [PATCH v4 bpf 1/2] bpf: fix skb_do_redirect return values To: Martin KaFai Lau Cc: bpf@vger.kernel.org, Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Mykola Lysenko , Shuah Khan , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, linux-kselftest@vger.kernel.org, kernel-team@cloudflare.com, Jordan Griege , Markus Elfring , Jakub Sitnicki Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 31, 2023 at 5:11=E2=80=AFPM Martin KaFai Lau wrote: > > On 7/31/23 2:35 PM, Yan Zhai wrote: > > On Fri, Jul 28, 2023 at 5:02=E2=80=AFPM Martin KaFai Lau wrote: > >> > >> On 7/25/23 6:08 PM, Yan Zhai wrote: > >>> skb_do_redirect returns various of values: error code (negative), > >>> 0 (success), and some positive status code, e.g. NET_XMIT_CN, > >>> NET_RX_DROP. Commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel > >>> infrastructure") didn't check the return code correctly, so positive > >>> values are propagated back along call chain: > >>> > >>> ip_finish_output2 > >>> -> bpf_xmit > >>> -> run_lwt_bpf > >>> -> skb_do_redirect > >> > >> From looking at skb_do_redirect, the skb_do_redirect should have con= sumed the > >> skb except for the -EAGAIN return value. afaik, -EAGAIN could only hap= pen by > >> using the bpf_redirect_peer helper. lwt does not have the bpf_redirect= _peer > >> helper available, so there is no -EAGAIN case in lwt. iow, skb_do_redi= rect > >> should have always consumed the skb in lwt. or did I miss something? > >> > >> If that is the case, it feels like the fix should be in run_lwt_bpf() = and the > >> "if (ret =3D=3D 0)" test in run_lwt_bpf() is unnecessary? > >> > >> ret =3D skb_do_redirect(skb); > >> if (ret =3D=3D 0) > >> ret =3D BPF_REDIRECT; > >> > >> > > Just fixing skb redirect return code won't be sufficient. I realized > > there are other return paths that need to be treated, e.g. bpf reroute > > path also directly returns dev_queue_xmit status. I plan to check for > > LWTUNNEL_XMIT_CONTINUE (and change it to a value that does not > > conflict with NET_RX_DROP and NET_XMIT_DROP) in the next revision. On > > the other hand, the return value of NETDEV_TX_BUSY is another hassle. > > I suspect we are talking about different things or I am still missing som= ething. > > I was thinking skb_do_redirect() should have always consumed the skb and > bpf_xmit should always return LWTUNNEL_XMIT_DONE also (instead of > LWTUNNEL_XMIT_CONTINUE described in the this patch commit message). It is= what > sch_handle_egress() is doing also. Could you explain how is it different = from > the skb_do_redirect usage in sch_handle_egress() or you are suggesting th= e > current sch_handle_egress() has the issue too also? > I think we were not on the same page. You are absolutely right that skb_do_redirect should consume the packet anyway. The difference between your proposal and this patch is that this patch returns errno or LWTUNNEL_XMIT_DONE, and yours does not even return errno. Both approaches fix the issue of "redirect to down device crashes the kernel". What I commented was an exact same issue at different location: BPF reroute may trigger the crash as well, since it also returns dev_queue_xmit status in bpf_xmit. Need to fix this, or instead fixing LWTUNNEL_XMIT_CONTINUE value and correct the behavior at lwtunnel_xmit rather than bpf_xmit. Yan > > > As Dan suggested, packets might not have been freed when this is > > returned from drivers. The caller of dev_queue_xmit might need to free > > skb when this happens. > > > > Yan >