Received: from szxga04-in.huawei.com ([45.249.212.190]:15093 "EHLO huawei.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1730921AbeKMQg4 (ORCPT ); Tue, 13 Nov 2018 11:36:56 -0500
Message-ID: <5BEA71CB.3090003@huawei.com>
Date: Tue, 13 Nov 2018 14:40:11 +0800
From: zhong jiang
MIME-Version: 1.0
To: Dave Wysochanski
CC: Benjamin Coddington, LKML
Subject: Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
References: <5BD85266.6000301@huawei.com>
	<1DEE371C-69EB-4D92-8F78-535AA5203007@redhat.com>
	<5BD86392.7070200@huawei.com>
	<1541620162.4051.5.camel@redhat.com>
In-Reply-To: <1541620162.4051.5.camel@redhat.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Sender: linux-crypto-owner@vger.kernel.org

On 2018/11/8 3:49, Dave Wysochanski wrote:
> On Tue, 2018-10-30 at 21:58 +0800, zhong jiang wrote:
>> On 2018/10/30 21:06, Benjamin Coddington wrote:
>>> Hi zhong jiang,
>>>
>>> Try asking in linux-nfs.. but I'll also note that 3.10-stable may
>>> be missing a number of fixes to leaks in the NFS GSS code.
>>>
>>> I can see a more than a few fixes to memory leaks with:
>>> git log --grep=leak --oneline net/sunrpc/auth_gss/
>>>
>> Thanks for your reply. I has tested some of them in the upsteam as
>> you have said. but It fails to solve the issue completely.
>> hence, I turn to the relevant experts whether they have happened to
>> the issue or can give some suggestion or not.
>>
>> Thanks,
>> zhong jiang
>>> Ben
>>>
>>> On 30 Oct 2018, at 8:45, zhong jiang wrote:
>>>
>>>> Hi, Herbert
>>>>
>>>> Recently, I hit a memory leak issue when mounting and
>>>> unmounting nfs with the way of krb5.
>>>> The issue happens to the linux-3.10-stable.
>>>>
>>>> I find that slab-1024 and slab-512 will take up most of the
>>>> memory. And it can not be freed.
>>>> Meanwhile, it result in rpcsec_gss_krb5 can be unregistered as
>>>> well.
>>>>
>>>>
> Are you running the latest 3.10-stable?
>
> This sounds very familiar to something I encountered a while ago and it
> was a sunrpc cache related problem. The patch that fixed it for me is
> in 3.10.106 though.
>
> Can you check if this cache is growing indefinitely?
> /proc/net/rpc/auth.rpcsec.context
>
> If it is large, try to flush explicitly with:
> date +%s > /proc/net/rpc/auth.rpcsec.context/flush
>
> If all that checks out, you may need the below upstream fix, but it
> went into v3.10.106 as
> 6a4a5fd svcrpc: don't leak contexts on PROC_DESTROY
>
> commit 6a4a5fd4c7bc6a06ca26ad7327d046d8d3c0932a
> Author: J. Bruce Fields
> Date:   Mon Jan 9 17:15:18 2017 -0500
>
>     svcrpc: don't leak contexts on PROC_DESTROY
>
>     commit 78794d1890708cf94e3961261e52dcec2cc34722 upstream.
>
>     Context expiry times are in units of seconds since boot, not unix time.
>
>     The use of get_seconds() here therefore sets the expiry time decades in
>     the future. This prevents timely freeing of contexts destroyed by
>     client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
>     (when the module is unloaded or the container shut down), but a lot of
>     contexts could pile up before then.
>
>     Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
>     Reported-by: Andy Adamson
>     Signed-off-by: J. Bruce Fields
>     Signed-off-by: Willy Tarreau
>
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 62663a0..e625efe 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1518,7 +1518,7 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>  	case RPC_GSS_PROC_DESTROY:
>  		if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
>  			goto auth_err;
> -		rsci->h.expiry_time = get_seconds();
> +		rsci->h.expiry_time = seconds_since_boot();
>  		set_bit(CACHE_NEGATIVE, &rsci->h.flags);
>  		if (resv->iov_len + 4 > PAGE_SIZE)
>  			goto drop;
>
> .
>
Hi Dave,

Thank you for your kind help and reply.
Sorry for the late reply; I have only just finished testing the patch.
On its own it does not fully fix the leak, but with the following three
upstream patches applied together the issue no longer occurs:

0070ed3 Fix 16-byte memory leak in gssp_accept_sec_context_upcall
78794d1 svcrpc: don't leak contexts on PROC_DESTROY
a1d1e9b svcrpc: fix memory leak in gssp_accept_sec_context_upcall

I think we should backport these patches to stable-3.10.

Thanks,
zhong jiang
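
To illustrate why the get_seconds() assignment in the quoted patch keeps
destroyed contexts alive, here is a minimal userspace sketch in C. It is
not kernel code: seconds_until_expiry(), is_expired(),
expiry_with_unix_time() and expiry_with_boot_time() are hypothetical
helpers, and any timestamps passed to them are assumed sample values.

```c
/*
 * Userspace model of the time-base mismatch described in the quoted
 * commit message. All names here are hypothetical, not kernel APIs.
 */

/* Seconds left before an entry expires; both arguments are assumed to be
 * in seconds since boot. A result <= 0 means the entry has expired. */
static long long seconds_until_expiry(long long expiry_time,
                                      long long now_since_boot)
{
    return expiry_time - now_since_boot;
}

/* Returns 1 if the entry may be freed now, 0 otherwise. */
static int is_expired(long long expiry_time, long long now_since_boot)
{
    return seconds_until_expiry(expiry_time, now_since_boot) <= 0;
}

/* Models the old behaviour: a wall-clock (Unix) timestamp is stored in a
 * field that is later compared against seconds since boot. */
static long long expiry_with_unix_time(long long unix_now,
                                       long long boot_now)
{
    (void)boot_now; /* ignored: this is the mismatch */
    return unix_now;
}

/* Models the fixed behaviour: the expiry uses the same time base as the
 * comparison, so "expire now" really means now. */
static long long expiry_with_boot_time(long long unix_now,
                                       long long boot_now)
{
    (void)unix_now; /* wall-clock time is irrelevant here */
    return boot_now;
}
```

With an assumed wall-clock time of 1542091211 and an uptime of 3600
seconds, the first helper leaves the entry about 1.54 billion seconds
from expiry, while the second makes it expire immediately.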