Date: Fri, 24 Nov 2023 18:54:54 +0100
From: Dmytro Maluka
To: Michal Hocko
Cc: Liu Shixin, Andrew Morton, Greg Kroah-Hartman, huang ying, Aaron Lu,
 Dave Hansen, Jesper Dangaard Brouer, Vlastimil Babka, Kemi Wang,
 Kefeng Wang, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH -next v2] mm, proc: collect percpu free pages into the free pages
References: <20220822023311.909316-1-liushixin2@huawei.com>
 <20220822033354.952849-1-liushixin2@huawei.com>
 <20220822141207.24ff7252913a62f80ea55e90@linux-foundation.org>
 <6b2977fc-1e4a-f3d4-db24-7c4699e0773f@huawei.com>

On Tue, Aug 23, 2022 at 03:37:52PM +0200, Michal Hocko wrote:
> On Tue 23-08-22 20:46:43, Liu Shixin wrote:
> > On 2022/8/23 15:50, Michal Hocko wrote:
> > > On Mon 22-08-22 14:12:07, Andrew Morton wrote:
> > >> On Mon, 22 Aug 2022 11:33:54 +0800 Liu Shixin wrote:
> > >>
> > >>> The page on pcplist could be used, but not counted into memory free or
> > >>> avaliable, and pcp_free is only showed by show_mem() for now. Since commit
> > >>> d8a759b57035 ("mm, page_alloc: double zone's batchsize"), there is a
> > >>> significant decrease in the display of free memory, with a large number
> > >>> of cpus and zones, the number of pages in the percpu list can be very
> > >>> large, so it is better to let user to know the pcp count.
> > >>>
> > >>> On a machine with 3 zones and 72 CPUs. Before commit d8a759b57035, the
> > >>> maximum amount of pages in the pcp lists was theoretically 162MB(3*72*768KB).
> > >>> After the patch, the lists can hold 324MB. It has been observed to be 114MB
> > >>> in the idle state after system startup in practice(increased 80 MB).
> > >>>
> > >> Seems reasonable.
> > > I have asked in the previous incarnation of the patch but haven't really
> > > received any answer[1]. Is this a _real_ problem?
> > > The absolute amount of
> > > memory could be perceived as a lot but is this really noticeable wrt
> > > overall memory on those systems?

Let me provide some other numbers, from the desktop side. On a low-end
chromebook with 4GB RAM and a dual-core CPU, after commit b92ca18e8ca5
("mm/page_alloc: disassociate the pcp->high from pcp->batch") the max
amount of PCP pages increased 56x: from 2.9MB (1.45MB per CPU) to 165MB
(82.5MB per CPU).

On such a system, memory pressure conditions are not a rare occurrence,
so several dozen MB make a lot of difference.

(The reason it increased so much is that it now corresponds to the low
watermark, which is 165MB. And the low watermark, in turn, is so high
because of khugepaged, which bumps up min_free_kbytes to 132MB regardless
of the total amount of memory.)

> > This may not obvious when the memory is sufficient. However, as products monitor the
> > memory to plan it. The change has caused warning.
>
> Is it possible that the said monitor is over sensitive and looking at
> wrong numbers? Overall free memory doesn't really tell much TBH.
> MemAvailable is a very rough estimation as well.
>
> In reality what really matters much more is whether the memory is
> readily available when it is required and none of MemFree/MemAvailable
> gives you that information in general case.
>
> > We have also considered using /proc/zoneinfo to calculate the total
> > number of pcplists. However, we think it is more appropriate to add
> > the total number of pcplists to free and available pages. After all,
> > this part is also free pages.
>
> Those free pages are not generally available as exaplained. They are
> available to a specific CPU, drained under memory pressure and other
> events but still there is no guarantee a specific process can harvest
> that memory because the pcp caches are replenished all the time.
> So in a sense it is a semi-hidden memory.
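
For reference, the /proc/zoneinfo approach mentioned in the quoted text
could look roughly like the sketch below. It is only an illustration, not
anything taken from the patch: it assumes the usual "pagesets" layout with
per-CPU "count:" and "high:" lines, assumes 4 KiB pages, and the names
pcp_totals, pages_to_mb and SAMPLE (with its made-up values) are
hypothetical.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 KiB pages; other page sizes need a different value


def pcp_totals(zoneinfo_text):
    """Sum the per-CPU pageset 'count:' and 'high:' fields over all zones.

    Returns (count_pages, high_pages). Only lines of the form
    'count: N' / 'high: N' (with a colon) are matched, so the zone
    watermark line 'high N' (no colon) is not picked up.
    """
    totals = {"count": 0, "high": 0}
    for line in zoneinfo_text.splitlines():
        m = re.fullmatch(r"\s*(count|high):\s*(\d+)\s*", line)
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals["count"], totals["high"]


def pages_to_mb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to MB (1 MB = 1024 KB here)."""
    return pages * page_size_kb / 1024


# Made-up excerpt in the shape of /proc/zoneinfo, for illustration only.
SAMPLE = """\
Node 0, zone   Normal
  pages free     1000
        min      200
        low      250
        high     300
  pagesets
    cpu: 0
              count: 120
              high:  378
              batch: 63
    cpu: 1
              count: 30
              high:  378
              batch: 63
"""

count_pages, high_pages = pcp_totals(SAMPLE)
```

On a live system one would pass the contents of /proc/zoneinfo instead of
SAMPLE. The "count" sum is what currently sits in the PCP lists, while the
"high" sum is only an upper bound, and per the discussion above neither is
guaranteed to be available to a given allocation.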

I was intuitively assuming that per-CPU pages should always be available
for allocation without resorting to paging out allocated pages (and thus
it should be non-controversially a good idea to include per-CPU pages in
MemFree, to make it more accurate).

But looking at the code in __alloc_pages() and around, I see you are
right: we don't try draining other CPUs' PCP lists *before* resorting to
direct reclaim, compaction etc.

BTW, why not? Shouldn't draining PCP lists be cheaper than pageout() in
any case?

> That being said, I am still not convinced this is actually going to help
> all that much. You will see a slightly different numbers which do not
> tell much one way or another and if the sole reason for tweaking these
> numbers is that some monitor is complaining because X became X-epsilon
> then this sounds like a weak justification to me. That epsilon happens
> all the time because there are quite some hidden caches that are
> released under memory pressure. I am not sure it is maintainable to
> consider each one of them and pretend that MemFree/MemAvailable is
> somehow precise. It has never been and likely never will be.
> --
> Michal Hocko
> SUSE Labs