From: Liang Li
Date: Mon, 11 Jan 2021 12:41:40 +0800
Subject: Re: [PATCH 4/6] hugetlb: avoid allocation failed when page reporting is on going
To: Alexander Duyck
Cc: Mel Gorman, Andrew Morton, Andrea Arcangeli, Dan Williams,
    "Michael S. Tsirkin", David Hildenbrand, Jason Wang, Dave Hansen,
    Michal Hocko, Liang Li, Mike Kravetz, linux-mm, LKML,
    virtualization@lists.linux-foundation.org

> > > Please don't use this email address for me anymore. Either use
> > > alexander.duyck@gmail.com or alexanderduyck@fb.com. I am getting
> > > bounces when I reply to this thread because of the old address.
> >
> > No problem.
> >
> > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > > index eb533995cb49..0fccd5f96954 100644
> > > > --- a/mm/hugetlb.c
> > > > +++ b/mm/hugetlb.c
> > > > @@ -2320,6 +2320,12 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> > > >                 goto out_uncharge_cgroup_reservation;
> > > >
> > > >         spin_lock(&hugetlb_lock);
> > > > +       while (h->free_huge_pages <= 1 && h->isolated_huge_pages) {
> > > > +               spin_unlock(&hugetlb_lock);
> > > > +               mutex_lock(&h->mtx_prezero);
> > > > +               mutex_unlock(&h->mtx_prezero);
> > > > +               spin_lock(&hugetlb_lock);
> > > > +       }
> > >
> > > This seems like a bad idea. It kind of defeats the whole point of
> > > doing the page zeroing outside of the hugetlb_lock. It also operates
> > > on the assumption that the only way you might get a page is from the
> > > page zeroing logic.
> > >
> > > With the page reporting code we wouldn't drop the count to zero. We
> > > had checks that monitored the watermarks, and if we started to hit
> > > the low watermark we would stop page reporting and just assume there
> > > weren't enough pages to report. You might need to do something
> > > similar here so that you can avoid colliding with the allocator.
> >
> > For hugetlb, things are a little different, just as Mike points out:
> > "On some systems, hugetlb pages are a precious resource and the
> > sysadmin carefully configures the number needed by applications.
> > Removing a hugetlb page (even for a very short period of time) could
> > cause serious application failure."
> >
> > Just keeping some pages in the free list is not enough to prevent
> > that from happening, because those pages may be allocated while
> > zeroing is ongoing, and the application may still end up with no
> > free pages available.
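For reference, the watermark cut-off that Alexander describes a few lines up
might look roughly like the sketch below. Only free_huge_pages,
free_huge_pages_node, hugepage_freelists and hugetlb_lock are existing
hugetlb state; hugepage_prezero_low_wmark() and prezero_isolate_pages() are
made-up names used purely for illustration and are not part of this series.

/*
 * Illustrative sketch only, meant as a fragment of mm/hugetlb.c where
 * struct hstate and hugetlb_lock are in scope: stop pulling pages out of
 * the hugetlb free list for background zeroing once the pool drops to a
 * low watermark, so the allocator never sees an empty pool.
 * hugepage_prezero_low_wmark() is a hypothetical helper, e.g. some
 * fraction of h->nr_huge_pages; a single node list is shown for brevity.
 */
static unsigned long prezero_isolate_pages(struct hstate *h, int nid,
					   struct list_head *zero_list)
{
	struct page *page, *next;
	unsigned long isolated = 0;

	spin_lock(&hugetlb_lock);
	list_for_each_entry_safe(page, next, &h->hugepage_freelists[nid], lru) {
		if (h->free_huge_pages <= hugepage_prezero_low_wmark(h))
			break;	/* leave the rest for the allocator */
		list_move(&page->lru, zero_list);
		h->free_huge_pages--;
		h->free_huge_pages_node[nid]--;
		isolated++;
	}
	spin_unlock(&hugetlb_lock);

	return isolated;
}

The idea mirrors how free page reporting backs off when a zone nears its
low watermark, so the allocator never has to wait on the zeroing thread.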
>
> I get what you are saying. However, I don't know if it is acceptable
> for the allocating thread to be put to sleep in this situation. There
> are two scenarios where I can see this being problematic.
>
> One is a setup where you put the page allocator to sleep and, while it
> is sleeping, another thread frees a page; your thread cannot respond
> to that newly freed page and is stuck waiting on the zeroed page.
>
> The second issue is that users may want the option of breaking the
> request up into smaller pages rather than waiting on the page zeroing,
> or of doing something else while waiting on the page. So instead of
> sitting on the request and waiting, it might make more sense to return
> an error pointer like EAGAIN or EBUSY to indicate that there is a page
> there, but it is momentarily tied up.

It seems that returning EAGAIN or EBUSY will still change the
application's behavior; I am not sure whether that is acceptable.

Thanks
Liang
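For comparison, a fail-fast variant of the hunk quoted earlier might look
something like the sketch below. The free_huge_pages/isolated_huge_pages
test follows the quoted diff; the -EBUSY/-EAGAIN policy is only the
alternative Alexander suggests above, alloc_huge_page_or_busy() is a
made-up name, and the charge/reservation unwinding that alloc_huge_page()
normally performs is left out.

#include <linux/err.h>	/* ERR_PTR(), IS_ERR(), PTR_ERR() */

/*
 * Sketch only: a fail-fast check that could replace the busy-wait loop
 * in the quoted diff.  h, hugetlb_lock and isolated_huge_pages are the
 * hugetlb state used by the series; everything else alloc_huge_page()
 * does (cgroup charges, subpool accounting, reservations) is elided.
 */
static struct page *alloc_huge_page_or_busy(struct hstate *h)
{
	spin_lock(&hugetlb_lock);
	if (h->free_huge_pages <= 1 && h->isolated_huge_pages) {
		/* Pages exist but are tied up in background zeroing. */
		spin_unlock(&hugetlb_lock);
		return ERR_PTR(-EBUSY);		/* or ERR_PTR(-EAGAIN) */
	}

	/* ... the existing dequeue/reservation path would continue here ... */
	spin_unlock(&hugetlb_lock);
	return NULL;
}

A caller could then test IS_ERR(page) && PTR_ERR(page) == -EBUSY and fall
back to smaller pages or retry later instead of sleeping, which is exactly
the behavior change Liang is unsure applications can tolerate.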