Subject: Re: [PATCH] gso: do not skip outer ip header in case of ipip and net_failover
From: Tao Liu
Date: Mon, 14 Feb 2022 12:03:38 +0800
To: Willem de Bruijn
Cc: davem@davemloft.net, yoshfuji@linux-ipv6.org, dsahern@kernel.org, kuba@kernel.org, edumazet@google.com, sridhar.samudrala@intel.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20220213150234.31602-1-thomas.liu@ucloud.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

Sorry for the bother; just reposting this.

> On Feb 14, 2022, at 09:28, Willem de Bruijn wrote:
>
> On Sun, Feb 13, 2022 at 10:10 AM Tao Liu wrote:
>>
>> We encounter a tcp drop issue in our cloud environment. Packet GROed in host
>> forwards to a VM virtio_net nic with net_failover enabled. VM acts as an
>> IPVS LB with ipip encapsulation. The full path like:
>>   host gro -> vm virtio_net rx -> net_failover rx -> ipvs fullnat
>>   -> ipip encap -> net_failover tx -> virtio_net tx
>>
>> When net_failover transmits an ipip pkt (gso_type = 0x0103), there is no gso
>> performed because it supports TSO and GSO_IPXIP4. But network_header has
>> been pointing to inner ip header.
>
> If the packet is configured correctly, and net_failover advertises
> that it can handle TSO packets with IPIP encap, then still virtio_net
> should not advertise it and software GSO be applied on its
> dev_queue_xmit call.
>
> This is assuming that the packet not only has SKB_GSO_IPXIP4 correctly
> set, but also tunneling fields like skb->encapsulated and
> skb_inner_network_header.

Thanks very much for your comment!

Yes, the packet is correct. One more thing I did not point out directly:
the packet also has SKB_GSO_DODGY set. net_failover does not advertise
GSO_ROBUST, but virtio_net does.
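For readers decoding the gso_type value mentioned above, here is a small userspace sketch. The bit positions are what I recall of the SKB_GSO_* enum in include/linux/skbuff.h around v5.17, so please check them against the tree in use. The TOY_ names and the helpers are made up for illustration. The real net_gso_ok() also shifts gso_type by NETIF_F_GSO_SHIFT before comparing; that step is left out here, and the feature set given for net_failover is only what this thread states about it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as in the SKB_GSO_* enum (assumed from include/linux/skbuff.h,
 * around v5.17; verify against your tree). TOY_ names are illustrative only. */
enum {
	TOY_GSO_TCPV4  = 1u << 0,
	TOY_GSO_DODGY  = 1u << 1,	/* device-side counterpart: NETIF_F_GSO_ROBUST */
	TOY_GSO_IPXIP4 = 1u << 8,
};

/* gso_type bits that the device does not advertise. */
static inline uint32_t toy_missing_gso_bits(uint32_t dev_bits, uint32_t gso_type)
{
	return gso_type & ~dev_bits;
}

/* Simplified stand-in for net_gso_ok(): every gso_type bit must be advertised. */
static inline bool toy_gso_ok(uint32_t dev_bits, uint32_t gso_type)
{
	return toy_missing_gso_bits(dev_bits, gso_type) == 0;
}

/* Worked example with the values from this thread. */
static inline void toy_gso_type_example(void)
{
	const uint32_t gso_type = 0x0103;
	/* net_failover as described here: TSO and GSO_IPXIP4, no GSO_ROBUST. */
	const uint32_t failover = TOY_GSO_TCPV4 | TOY_GSO_IPXIP4;

	assert(gso_type == (TOY_GSO_TCPV4 | TOY_GSO_DODGY | TOY_GSO_IPXIP4));
	assert(!toy_gso_ok(failover, gso_type));
	assert(toy_missing_gso_bits(failover, gso_type) == TOY_GSO_DODGY);
	/* With GSO_ROBUST advertised as well, no bit is missing. */
	assert(toy_gso_ok(failover | TOY_GSO_DODGY, gso_type));
}
```

Under these assumptions 0x0103 is TCPV4 | DODGY | IPXIP4, and the only bit net_failover lacks is the DODGY one, which is why the GSO path is entered there at all.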
>> ---
>>  net/ipv4/af_inet.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index 9c465ba..f8b3f8a 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -1425,10 +1425,18 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
>>  static struct sk_buff *ipip_gso_segment(struct sk_buff *skb,
>>  					netdev_features_t features)
>>  {
>> +	struct sk_buff *segs;
>> +	int nhoff;
>> +
>>  	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_IPXIP4))
>>  		return ERR_PTR(-EINVAL);
>>
>> -	return inet_gso_segment(skb, features);
>> +	nhoff = skb_network_header(skb) - skb_mac_header(skb);
>> +	segs = inet_gso_segment(skb, features);
>> +	if (!segs)
>> +		skb->network_header = skb_mac_header(skb) + nhoff - skb->head;
>> +
>> +	return segs;
>>  }
>
> If this would be needed for IPIP, then the same would be needed for SIT, etc.
>
> Is the skb_network_header
>
> 1. correctly pointing to the outer header of the TSO packet before the
> call to inet_gso_segment
> 2. incorrectly pointing to the inner header of the (still) TSO packet
> after the call to inet_gso_segment
>
> inet_gso_segment already does the same operation: save nhoff, pull
> network header, call callbacks.gso_segment (which can be
> ipip_gso_segment->inet_gso_segment), then place the network header
> back at nhoff.
>

Values printed in skb_mac_gso_segment() before callbacks.gso_segment:

  ipip:         vlan_depth=0   mac_len=0   skb->network_header=206
  net_failover: vlan_depth=14  mac_len=14  skb->network_header=186
  virtio_net:   vlan_depth=34  mac_len=34  skb->network_header=206

I agree to add sit/ip4ip6/ip6ip6, and the patch can be simplified as:

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9c465ba..72fde28 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1376,8 +1376,11 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
 	}
 
 	ops = rcu_dereference(inet_offloads[proto]);
-	if (likely(ops && ops->callbacks.gso_segment))
+	if (likely(ops && ops->callbacks.gso_segment)) {
 		segs = ops->callbacks.gso_segment(skb, features);
+		if (!segs)
+			skb->network_header = skb_mac_header(skb) + nhoff - skb->head;
+	}
 
 	if (IS_ERR_OR_NULL(segs))
 		goto out;
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index b29e9ba..5f577e2 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -114,6 +114,8 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
 	if (likely(ops && ops->callbacks.gso_segment)) {
 		skb_reset_transport_header(skb);
 		segs = ops->callbacks.gso_segment(skb, features);
+		if (!segs)
+			skb->network_header = skb_mac_header(skb) + nhoff - skb->head;
 	}
 
 	if (IS_ERR_OR_NULL(segs))
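
To illustrate what the added `if (!segs)` restore is meant to do, here is a rough userspace model. It is not kernel code: struct toy_skb and the toy_* functions are hypothetical, header positions are plain offsets from head, and the inner callback is reduced to "take over network_header and return no segments". The 20-byte outer header length is my inference from 206 - 186 in the values above, and mac_header = 172 follows from network_header=186 with mac_len=14.

```c
#include <assert.h>

/* Toy skb: header positions kept as offsets from skb->head. Hypothetical type. */
struct toy_skb {
	unsigned int mac_header;
	unsigned int network_header;
	unsigned int data;	/* current parse position */
};

/* Inner layer: takes over network_header for the inner IP header, then
 * returns 0 segments (models the segs == NULL return). */
static inline int toy_inner_segment(struct toy_skb *skb)
{
	skb->network_header = skb->data;
	return 0;
}

/* Outer layer: save nhoff, pull the outer header, call the inner layer, and
 * optionally put network_header back when no segments came out.
 * data is not pushed back here; the toy only tracks network_header. */
static inline int toy_outer_segment(struct toy_skb *skb, unsigned int outer_ihl,
				    int restore_on_null)
{
	unsigned int nhoff;
	int nsegs;

	skb->network_header = skb->data;		/* like skb_reset_network_header() */
	nhoff = skb->network_header - skb->mac_header;	/* save the outer offset */
	skb->data += outer_ihl;				/* pull the outer IP header */

	nsegs = toy_inner_segment(skb);
	if (nsegs == 0 && restore_on_null)
		skb->network_header = skb->mac_header + nhoff;
	return nsegs;
}

/* Worked example with the net_failover values printed above. */
static inline void toy_restore_example(void)
{
	struct toy_skb before = { .mac_header = 172, .network_header = 186, .data = 186 };
	struct toy_skb after = before;

	/* Without the restore, the next device sees the inner header. */
	toy_outer_segment(&before, 20, 0);
	assert(before.network_header == 206);
	assert(before.network_header - before.mac_header == 34);

	/* With the restore, network_header is back on the outer header. */
	toy_outer_segment(&after, 20, 1);
	assert(after.network_header == 186);
}
```

In this model the unrestored case ends with network_header=206 and an apparent mac_len of 34, which matches what virtio_net printed; with the restore it stays at 186.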