Date: Thu, 28 Feb 2019 11:01:36 +0000
From: Wei Liu
To: Paul Durrant
CC: Igor Druzhinin, "xen-devel@lists.xenproject.org", "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", Wei Liu, "davem@davemloft.net"
Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
Message-ID: <20190228110136.somjads2f5ivqhju@zion.uk.xensource.com>
References: <1551319382-32595-1-git-send-email-igor.druzhinin@citrix.com> <8424fba34a1f43908a6dae3a9308863b@AMSPEX02CL02.citrite.net>
In-Reply-To: <8424fba34a1f43908a6dae3a9308863b@AMSPEX02CL02.citrite.net>
User-Agent: NeoMutt/20170113 (1.7.2)
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 28, 2019 at 09:46:57AM +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Igor Druzhinin [mailto:igor.druzhinin@citrix.com]
> > Sent: 28 February 2019 02:03
> > To: xen-devel@lists.xenproject.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> > Cc: Wei Liu; Paul Durrant; davem@davemloft.net; Igor Druzhinin
> > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure
> >
> > Zero-copy callback flag is not yet set on frag list skb at the moment
> > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > called for these fragments. Those eventually build up and cause Xen
> > to kill Dom0 as the slots get reused for new mappings.
> >
> > That behavior is observed under certain workloads where sudden spikes
> > of page cache usage for writes coexist with active atomic skb allocations.
> >
> > Signed-off-by: Igor Druzhinin
> > ---
> >  drivers/net/xen-netback/netback.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > index 80aae3a..2023317 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> >
> >  		if (unlikely(skb_has_frag_list(skb))) {
> >  			if (xenvif_handle_frag_list(queue, skb)) {
> > +				struct sk_buff *nskb =
> > +						skb_shinfo(skb)->frag_list;
> >  				if (net_ratelimit())
> >  					netdev_err(queue->vif->dev,
> >  						   "Not enough memory to consolidate frag_list!\n");
> > +				xenvif_skb_zerocopy_prepare(queue, nskb);
> >  				xenvif_skb_zerocopy_prepare(queue, skb);
> >  				kfree_skb(skb);
> >  				continue;
>
> Whilst this fix will do the job, I think it would be better to get rid of the kfree_skb() from inside xenvif_handle_frag_list() and always deal with it here rather than having it happen in two different places. Something like the following...

+1 for having only one place.

>
> ---8<---
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 80aae3a32c2a..093c7b860772 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>  /* Consolidate skb with a frag_list into a brand new one with local pages on
>   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
>   */
> -static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> +static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
> +				   struct sk_buff *nskb)
>  {
>  	unsigned int offset = skb_headlen(skb);
>  	skb_frag_t frags[MAX_SKB_FRAGS];
>  	int i, f;
>  	struct ubuf_info *uarg;
> -	struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
>
>  	queue->stats.tx_zerocopy_sent += 2;
>  	queue->stats.tx_frag_overflow++;
> @@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>  		skb_frag_size_set(&frags[i], len);
>  	}
>
> -	/* Copied all the bits from the frag list -- free it. */
> -	skb_frag_list_init(skb);
> -	xenvif_skb_zerocopy_prepare(queue, nskb);
> -	kfree_skb(nskb);
> -
>  	/* Release all the original (foreign) frags. */
>  	for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
>  		skb_frag_unref(skb, f);
> @@ -1145,7 +1140,11 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
>  		xenvif_fill_frags(queue, skb);
>
>  		if (unlikely(skb_has_frag_list(skb))) {
> -			if (xenvif_handle_frag_list(queue, skb)) {
> +			struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> +
> +			xenvif_skb_zerocopy_prepare(queue, nskb);
> +
> +			if (xenvif_handle_frag_list(queue, skb, nskb)) {
>  				if (net_ratelimit())
>  					netdev_err(queue->vif->dev,
>  						   "Not enough memory to consolidate frag_list!\n");
> @@ -1153,6 +1152,10 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
>  				kfree_skb(skb);
>  				continue;
>  			}
> +
> +			/* Copied all the bits from the frag list. */
> +			skb_frag_list_init(skb);
> +			kfree(nskb);

I think you want kfree_skb here?

Wei.

>  		}
>
>  		skb->dev = queue->vif->dev;
> ---8<---
>
> What do you think?
>
>   Paul
>
> > --
> > 2.7.4
>