Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp598658pxk; Thu, 17 Sep 2020 10:58:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwC6NYntXM7AblT5NGqAQsh7g6iikfpgz7uf5+lBUdoJ5dxsbu/vlwMjoFi7zBGbV4FCTfw X-Received: by 2002:a17:906:ecf1:: with SMTP id qt17mr32705730ejb.158.1600365494526; Thu, 17 Sep 2020 10:58:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1600365494; cv=none; d=google.com; s=arc-20160816; b=PvbM25uT+JhOA4fT1fA2LJtLkYkBsQxxRZAYXhYYNJdr32bNe00K8n4PQyk8+0QzQ+ wqAu4S+fli8KvhFoZZIO3wNwJUtzWgtbuq/20AC5JJNT5AcCzAsxLq2OE2arjS/9GZPC cc47t4klVkveIVA+bKeWUmenXKTaEvB9I02OU3wwb2+zZ3HYbJFwTeC3efe8uW3Fk97r QNpp2lm81AOSJDEOhTWmP9CgcJVSVj0vbxVI1BRMIPzCaY8zqPDiRR4CsLzw+8vj1Ed1 pLKkznCtInYJDEtItpdlzy3K1TKQ3iYmK0yAG3AnJ1hJkhuWQrAYbV0EE/V1EFGgCpqy xCmw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:message-id:references:in-reply-to :subject:cc:to:from:date:content-transfer-encoding:mime-version :dkim-signature; bh=pg6KaEx324bDub42qXfgj8NNuvTJUXFoa4wh2VHcn/c=; b=snaADUk6lHU/ZliPs7dBx1jRQ0aqMifogy1IjRf1aQSSlce9nJd5HRQWVK6JJtV3Zw zz6NkZOLZIk1BFv4kvjF5wxKipA1LbEHwVh5uqZ8XRk56GAItdaG1Raj/xBBXMWl7J/V WFSzjvOwfCs92nHNMVmd9u1UHsbrAd8fom3eAvLbTE+7Ig1CUGI8aTr53wdApi9X2IYP KmpVyeWj+V6kC36bxlfBRhsLqg8w2iY1CxN4NRcYxOrqxJC+dFaNR4LeqZ+hH5YWxaST omh3iqaLAY2blE0SFWwhqxfm4Rhoq9492cV3iq3jUI3xUqxlBJIPHAx34imjChwbd053 1Sbw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=E7A2KOsB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y12si294898edv.98.2020.09.17.10.57.50; Thu, 17 Sep 2020 10:58:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@mg.codeaurora.org header.s=smtp header.b=E7A2KOsB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726587AbgIQRyY (ORCPT + 99 others); Thu, 17 Sep 2020 13:54:24 -0400 Received: from m43-7.mailgun.net ([69.72.43.7]:31815 "EHLO m43-7.mailgun.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726381AbgIQRyS (ORCPT ); Thu, 17 Sep 2020 13:54:18 -0400 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1600365246; h=Message-ID: References: In-Reply-To: Subject: Cc: To: From: Date: Content-Transfer-Encoding: Content-Type: MIME-Version: Sender; bh=pg6KaEx324bDub42qXfgj8NNuvTJUXFoa4wh2VHcn/c=; b=E7A2KOsBKe3xAjn7uwPYaQwEv/WxCGipZX2UepS8jCx0vNdK8ZywhNzgx0HwxMT6L+kPgjo3 WwaFvN9EKQCeTMlTmD++BoM7CCUKytTmSKEb37QH3S1udmwfmD/dLi0biTACc5ZBGZRTYl5v gKjDII/R+qMGMNp0TjglrYAei5k= X-Mailgun-Sending-Ip: 69.72.43.7 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by smtp-out-n05.prod.us-west-2.postgun.com with SMTP id 5f63a2bdea858627d501668d (version=TLS1.2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); Thu, 17 Sep 2020 17:54:05 GMT Received: by smtp.codeaurora.org (Postfix, from userid 1001) id 37DAFC433F1; Thu, 17 Sep 2020 17:54:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=ALL_TRUSTED,BAYES_00, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.codeaurora.org (localhost.localdomain [127.0.0.1]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: cgoldswo) by smtp.codeaurora.org (Postfix) with ESMTPSA id 344DFC433CA; Thu, 17 Sep 2020 17:54:04 +0000 (UTC) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Thu, 17 Sep 2020 10:54:04 -0700 From: Chris Goldsworthy To: David Hildenbrand Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, pratikp@codeaurora.org, pdaly@codeaurora.org, sudaraja@codeaurora.org, iamjoonsoo.kim@lge.com, linux-arm-msm-owner@vger.kernel.org, Vinayak Menon , linux-kernel-owner@vger.kernel.org Subject: Re: [PATCH v2] mm: cma: indefinitely retry allocations in cma_alloc In-Reply-To: References: <06489716814387e7f147cf53d1b185a8@codeaurora.org> <1599851809-4342-1-git-send-email-cgoldswo@codeaurora.org> <010101747e998731-e49f209f-8232-4496-a9fc-2465334e70d7-000000@us-west-2.amazonses.com> <72ae0f361df527cf70946992e4ab1eb3@codeaurora.org> Message-ID: <5cfa914fca107d884aa845b9273ec656@codeaurora.org> X-Sender: cgoldswo@codeaurora.org User-Agent: Roundcube Webmail/1.3.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020-09-15 00:53, David Hildenbrand wrote: > On 14.09.20 20:33, Chris Goldsworthy wrote: >> On 2020-09-14 02:31, David Hildenbrand wrote: >>> On 11.09.20 21:17, Chris Goldsworthy wrote: >>>> >>>> So, inside of cma_alloc(), instead of giving up when >>>> alloc_contig_range() >>>> returns -EBUSY after having scanned a whole CMA-region bitmap, >>>> perform >>>> retries indefinitely, with sleeps, to give the system an opportunity >>>> to >>>> unpin any pinned pages. >>>> >>>> Signed-off-by: Chris Goldsworthy >>>> Co-developed-by: Vinayak Menon >>>> Signed-off-by: Vinayak Menon >>>> --- >>>> mm/cma.c | 25 +++++++++++++++++++++++-- >>>> 1 file changed, 23 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/mm/cma.c b/mm/cma.c >>>> index 7f415d7..90bb505 100644 >>>> --- a/mm/cma.c >>>> +++ b/mm/cma.c >>>> @@ -442,8 +443,28 @@ struct page *cma_alloc(struct cma *cma, size_t >>>> count, unsigned int align, >>>> bitmap_maxno, start, bitmap_count, mask, >>>> offset); >>>> if (bitmap_no >= bitmap_maxno) { >>>> - mutex_unlock(&cma->lock); >>>> - break; >>>> + if (ret == -EBUSY) { >>>> + mutex_unlock(&cma->lock); >>>> + >>>> + /* >>>> + * Page may be momentarily pinned by some other >>>> + * process which has been scheduled out, e.g. >>>> + * in exit path, during unmap call, or process >>>> + * fork and so cannot be freed there. Sleep >>>> + * for 100ms and retry the allocation. >>>> + */ >>>> + start = 0; >>>> + ret = -ENOMEM; >>>> + msleep(100); >>>> + continue; >>>> + } else { >>>> + /* >>>> + * ret == -ENOMEM - all bits in cma->bitmap are >>>> + * set, so we break accordingly. >>>> + */ >>>> + mutex_unlock(&cma->lock); >>>> + break; >>>> + } >>>> } >>>> bitmap_set(cma->bitmap, bitmap_no, bitmap_count); >>>> /* >>>> >>> >>> What about long-term pinnings? IIRC, that can happen easily e.g., >>> with >>> vfio (and I remember there is a way via vmsplice). >>> >>> Not convinced trying forever is a sane approach in the general case >>> ... >> >> V1: >> [1] https://lkml.org/lkml/2020/8/5/1097 >> [2] https://lkml.org/lkml/2020/8/6/1040 >> [3] https://lkml.org/lkml/2020/8/11/893 >> [4] https://lkml.org/lkml/2020/8/21/1490 >> [5] https://lkml.org/lkml/2020/9/11/1072 >> >> We're fine with doing indefinite retries, on the grounds that if there >> is some long-term pinning that occurs when alloc_contig_range returns >> -EBUSY, that it should be debugged and fixed. Would it be possible to >> make this infinite-retrying something that could be enabled or >> disabled >> by a defconfig option? > > Two thoughts: > > This means I strongly prefer something like [3] if feasible. _Resending so that this ends up on LKML_ I can give [3] some further thought then. Also, I realized [3] will not completely solve the problem, it just reduces the window in which _refcount > _mapcount (as mentioned in earlier threads, we encountered the pinning when a task in copy_one_pte() or in the exit_mmap() path gets context switched out). If we were to try a sleeping-lock based solution, do you think it would be permissible to add another lock to struct page? > 2. The issue that I am having is that long-term pinnings are > (unfortunately) a real thing. It's not something to debug and fix as > you > suggest. Like, run a VM with VFIO (e.g., PCI passthrough). While that > VM > is running, all VM memory will be pinned. If memory falls onto a CMA > region your cma_alloc() will be stuck in an (endless, meaning until the > VM ended) loop. I am not sure if all cma users are fine with that - > especially, think about CMA being used for gigantic pages now. > > Assume you want to start a new VM while the other one is running and > use > some (new) gigantic pages for it. Suddenly you're trapped in an endless > loop in the kernel. That's nasty. Thanks for providing this example. > > If we want to stick to retrying forever, can't we use flags like > __GFP_NOFAIL to explicitly enable this new behavior for selected > cma_alloc() users that really can't fail/retry manually again? This would work, we would just have to undo the work done by this patch / re-introduce the GFP parameter for cma_alloc(): http://lkml.kernel.org/r/20180709122019eucas1p2340da484acfcc932537e6014f4fd2c29~-sqTPJKij2939229392eucas1p2j@eucas1p2.samsung.com , and add the support __GFP_NOFAIL (and ignore any flag that is not one of __GFP_NOFAIL or __GFP_NOWARN). -- The Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project