MIME-Version: 1.0
References: <20210106035027.GA1160@open-light-1.localdomain>
From: Alexander Duyck
Date: Thu, 7 Jan 2021 09:56:18 -0800
Subject: Re: [PATCH 4/6] hugetlb: avoid allocation failed when page reporting is on going
To: Liang Li
Cc: Mel Gorman, Andrew Morton, Andrea Arcangeli, Dan Williams, "Michael S. Tsirkin", David Hildenbrand, Jason Wang, Dave Hansen, Michal Hocko, Liang Li, Mike Kravetz, linux-mm, LKML, virtualization@lists.linux-foundation.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 6, 2021 at 7:57 PM Liang Li wrote:
>
> > > Page reporting temporarily isolates free pages while reporting
> > > free page information. This reduces the number of actually free
> > > pages and may cause application failures due to insufficient
> > > available memory. This patch tries to solve the issue: when there
> > > are no free pages and page reporting is ongoing, wait until it is
> > > done.
> > >
> > > Cc: Alexander Duyck
> >
> > Please don't use this email address for me anymore. Either use
> > alexander.duyck@gmail.com or alexanderduyck@fb.com. I am getting
> > bounces when I reply to this thread because of the old address.
>
> No problem.
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index eb533995cb49..0fccd5f96954 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2320,6 +2320,12 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
> > >                 goto out_uncharge_cgroup_reservation;
> > >
> > >         spin_lock(&hugetlb_lock);
> > > +       while (h->free_huge_pages <= 1 && h->isolated_huge_pages) {
> > > +               spin_unlock(&hugetlb_lock);
> > > +               mutex_lock(&h->mtx_prezero);
> > > +               mutex_unlock(&h->mtx_prezero);
> > > +               spin_lock(&hugetlb_lock);
> > > +       }
> >
> > This seems like a bad idea. It kind of defeats the whole point of
> > doing the page zeroing outside of the hugetlb_lock. It also operates
> > on the assumption that the only way you might get a page is from the
> > page zeroing logic.
> >
> > With the page reporting code we wouldn't drop the count to zero. We
> > had checks that monitored the watermarks, and if we started to hit
> > the low watermark we would stop page reporting and just assume there
> > aren't enough pages to report. You might need to do something similar
> > here so that you can avoid colliding with the allocator.
>
> For hugetlb, things are a little different, just as Mike points out:
> "On some systems, hugetlb pages are a precious resource and
> the sysadmin carefully configures the number needed by
> applications. Removing a hugetlb page (even for a very short
> period of time) could cause serious application failure."
>
> Just keeping some pages in the freelist is not enough to prevent that
> from happening, because those pages may be allocated while zeroing is
> in progress, and an application may still find no free pages available.

I get what you are saying. However, I don't know if it is acceptable
for the allocating thread to be put to sleep in this situation. There
are two scenarios where I can see this being problematic.
One is a setup where the page allocator is put to sleep, and while it
is sleeping another thread frees a page. The sleeping thread cannot
respond to that newly freed page because it is stuck waiting on the
zeroed page.

The second issue is that users may want the option of breaking the
request up into smaller pages rather than waiting on the page zeroing,
or of doing something else while waiting for the page. So instead of
sitting on the request and waiting, it might make more sense to return
an error pointer such as -EAGAIN or -EBUSY to indicate that there is a
page there, but it is momentarily tied up.