Subject: Re: [PATCH v1 2/2] mm/page_alloc: count CMA pages per zone and print them in /proc/zoneinfo
To: Oscar Salvador
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton, Thomas Gleixner, "Peter Zijlstra (Intel)", Mike Rapoport, Michal Hocko, Wei Yang
References: <20210127101813.6370-1-david@redhat.com> <20210127101813.6370-3-david@redhat.com> <20210128102234.GB5250@localhost.localdomain> <2246d657-4f6d-c27d-4ae2-853a8437cda9@redhat.com> <20210128134458.GA8136@localhost.localdomain>
From: David Hildenbrand
Organization: Red Hat GmbH
Date: Thu, 28 Jan 2021 15:01:46 +0100
In-Reply-To: <20210128134458.GA8136@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On 28.01.21 14:44, Oscar Salvador wrote:
> On Thu, Jan 28, 2021 at 11:43:41AM +0100, David Hildenbrand wrote:
>>> My knowledge of CMA tends to be quite low, actually I thought that CMA
>>> was somehow tied to ZONE_MOVABLE.
>>
>> CMA is often placed into one of the kernel zones, but can also end up in the movable zone.
>
> Ok good to know.
>
>>> I see how tracking CMA pages per zone might give you a clue, but what do
>>> you mean by "might behave differently - even after some of these pages might
>>> already have been allocated"
>>
>> Assume you have 4GB in ZONE_NORMAL but 1GB is assigned for CMA. You actually only have 3GB available for random kernel allocations, not 4GB.
>>
>> Currently, you can only observe the free CMA pages, excluding any pages that are already allocated. Knowing how many CMA pages we have in total can be helpful - similar to what we already have in /proc/meminfo.
>
> I see, I agree that it can provide some guidance.
>
>>> I see that NR_FREE_CMA_PAGES is there even without CONFIG_CMA, as you
>>> said, but I am not sure about adding size to a zone unconditionally.
>>> I mean, it is not terrible as IIRC, the maximum MAX_NUMNODES can get
>>> is 1024, and on x86_64 that would be (1024 * 4 zones) * 8 = 32K.
>>> So not a big deal, but still.
>>
>> I'm asking myself how many such systems will run without
>> CONFIG_CMA in the future.
>
> I am not sure, my comment was just to point out that even though the added
> size might not be that large, hiding it under CONFIG_CMA seemed the right
> thing to do.
>
>>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>>> index 8ba0870ecddd..5757df4bfd45 100644
>>> --- a/mm/vmstat.c
>>> +++ b/mm/vmstat.c
>>> @@ -1559,13 +1559,15 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>>>                     "\n spanned %lu"
>>>                     "\n present %lu"
>>>                     "\n managed %lu",
>>> +                   "\n cma %lu",
>>>                     zone_page_state(zone, NR_FREE_PAGES),
>>>                     min_wmark_pages(zone),
>>>                     low_wmark_pages(zone),
>>>                     high_wmark_pages(zone),
>>>                     zone->spanned_pages,
>>>                     zone->present_pages,
>>> -                   zone->managed_pages);
>>> +                   zone->managed_pages,
>>> +                   IS_ENABLED(CONFIG_CMA) ? zone->cma_pages : 0);
>>>          seq_printf(m,
>>>                     "\n protection: (%ld",
>>>
>>>
>>> I do not find it that ugly, but that is just my taste.
>>
>> IIRC, that does not work. The compiler will still complain
>> about a missing struct member. We would have to provide a
>> zone_cma_pages() helper with some ifdefery.
>
> Of course, it seems I switched off my brain.
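
For illustration, a minimal userspace sketch of the zone_cma_pages() helper idea mentioned above. It assumes a cut-down stand-in for struct zone (not the real definition from include/linux/mmzone.h), and format_cma_line() is a hypothetical name used only for this example. The point is that the member access sits behind the preprocessor, whereas IS_ENABLED(CONFIG_CMA) ? zone->cma_pages : 0 still has to compile both branches of the ternary.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for the kernel's struct zone. */
struct zone {
	unsigned long present_pages;
#ifdef CONFIG_CMA
	unsigned long cma_pages;	/* only present when CMA is built in */
#endif
};

/*
 * Accessor with the ifdefery in one place: callers never name
 * zone->cma_pages directly, so they build with or without CONFIG_CMA.
 */
static inline unsigned long zone_cma_pages(const struct zone *zone)
{
#ifdef CONFIG_CMA
	return zone->cma_pages;
#else
	(void)zone;	/* unused without CMA */
	return 0;
#endif
}

/* Hypothetical caller: formats a "cma" line into a caller-provided buffer. */
int format_cma_line(char *buf, size_t len, const struct zone *zone)
{
	assert(buf && zone);
	return snprintf(buf, len, "cma %lu", zone_cma_pages(zone));
}
```

Built without -DCONFIG_CMA the helper simply reports 0; with it, the struct gains the member and the helper returns it.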
>
>> We could do something like this on top
>>
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -530,7 +530,9 @@ struct zone {
>>          atomic_long_t managed_pages;
>>          unsigned long spanned_pages;
>>          unsigned long present_pages;
>> +#ifdef CONFIG_CMA
>>          unsigned long cma_pages;
>> +#endif
>>          const char *name;
>> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> index 97fc32a53320..b753a64f099f 100644
>> --- a/mm/vmstat.c
>> +++ b/mm/vmstat.c
>> @@ -1643,7 +1643,10 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>>                     "\n spanned %lu"
>>                     "\n present %lu"
>>                     "\n managed %lu"
>> -                   "\n cma %lu",
>> +#ifdef CONFIG_CMA
>> +                   "\n cma %lu"
>> +#endif
>> +                   "%s",
>>                     zone_page_state(zone, NR_FREE_PAGES),
>>                     min_wmark_pages(zone),
>>                     low_wmark_pages(zone),
>> @@ -1651,7 +1654,10 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
>>                     zone->spanned_pages,
>>                     zone->present_pages,
>>                     zone_managed_pages(zone),
>> -                   zone->cma_pages);
>> +#ifdef CONFIG_CMA
>> +                   zone->cma_pages,
>> +#endif
>> +                   "");
>>          seq_printf(m,
>>                     "\n protection: (%ld",
>
> Looks good to me, but I can see how those #ifdef can raise some
> eyebrows.

We could print it further above to avoid the "%s" ... "", or print it
separately below. Then we'd only need a single ifdef. Might make sense.

> Let us see what others think as well.
>
> Btw, should linux-uapi be CCed, as /proc/vmstat layout will change?

Is there a linux-uapi@ list? I know linux-api@ ("forum to discuss
changes that affect the Linux programming interface (API or ABI)").

Good question, I can certainly cc linux-api@, although I doubt it's
strictly necessary when adding something here.

Thanks!

-- 
Thanks,

David / dhildenb
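
For reference, a rough userspace sketch of the "print it separately below" variant, where the CMA line gets its own print call so that a single #ifdef suffices and the "%s" ... "" trick is not needed. struct zone_counts and show_zone_counts() are hypothetical stand-ins for illustration only, and snprintf() into a buffer stands in for seq_printf(); this is not the actual patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in holding only the counters printed here. */
struct zone_counts {
	unsigned long spanned_pages;
	unsigned long present_pages;
	unsigned long managed_pages;
#ifdef CONFIG_CMA
	unsigned long cma_pages;
#endif
};

/*
 * Print the unconditional counters first, then append the CMA line in a
 * separate call guarded by one #ifdef. Returns the number of characters
 * written (excluding the terminating NUL), or the snprintf() result if
 * the first print failed or was truncated.
 */
int show_zone_counts(char *buf, size_t len, const struct zone_counts *z)
{
	int n;

	assert(buf && z);
	n = snprintf(buf, len,
		     "\n spanned %lu"
		     "\n present %lu"
		     "\n managed %lu",
		     z->spanned_pages, z->present_pages, z->managed_pages);
	if (n < 0 || (size_t)n >= len)
		return n;

#ifdef CONFIG_CMA
	{
		/* The only place that names cma_pages. */
		int m = snprintf(buf + n, len - (size_t)n,
				 "\n cma %lu", z->cma_pages);
		if (m < 0)
			return m;
		n += m;
	}
#endif
	return n;
}
```

The trade-off is one extra print call when CONFIG_CMA is set, in exchange for keeping the main format string and argument list free of preprocessor conditionals.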