Subject: Re: [PATCH] arm64/mm: add fallback option to allocate virtually contiguous memory
To: Sudarshan Rajagopalan, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Cc: Catalin Marinas, Will Deacon, Anshuman Khandual, Mark Rutland, Logan Gunthorpe, David Hildenbrand, Andrew Morton
References: <01010174769e2b68-a6f3768e-aef8-43c7-b357-a8cb1e17d3eb-000000@us-west-2.amazonses.com>
From: Steven Price
Date: Thu, 10 Sep 2020 09:27:43 +0100
In-Reply-To: <01010174769e2b68-a6f3768e-aef8-43c7-b357-a8cb1e17d3eb-000000@us-west-2.amazonses.com>

On 10/09/2020 07:05, Sudarshan Rajagopalan wrote:
> When section mappings are enabled, we allocate vmemmap pages from physically
> contiguous memory of size PMD_SIZE using vmemmap_alloc_block_buf(). Section
> mappings help reduce TLB pressure. But when the system is highly fragmented
> and memory blocks are being hot-added at runtime, it's possible that such
> physically contiguous memory allocations can fail. Rather than failing the
> memory hot-add procedure, add a fallback option to allocate vmemmap pages from
> discontiguous pages using vmemmap_populate_basepages().
>
> Signed-off-by: Sudarshan Rajagopalan
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Anshuman Khandual
> Cc: Mark Rutland
> Cc: Logan Gunthorpe
> Cc: David Hildenbrand
> Cc: Andrew Morton
> Cc: Steven Price
> ---
>  arch/arm64/mm/mmu.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 75df62f..a46c7d4 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1100,6 +1100,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>      p4d_t *p4dp;
>      pud_t *pudp;
>      pmd_t *pmdp;
> +    int ret = 0;
>
>      do {
>          next = pmd_addr_end(addr, end);
> @@ -1121,15 +1122,23 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>              void *p = NULL;
>
>              p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
> -            if (!p)
> -                return -ENOMEM;
> +            if (!p) {
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +                vmemmap_free(start, end, altmap);
> +#endif
> +                ret = -ENOMEM;
> +                break;
> +            }
>
>              pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
>          } else
>              vmemmap_verify((pte_t *)pmdp, node, addr, next);
>      } while (addr = next, addr != end);
>
> -    return 0;
> +    if (ret)
> +        return vmemmap_populate_basepages(start, end, node, altmap);
> +    else
> +        return ret;

Style comment: I find this use of 'ret' confusing. When we assign -ENOMEM
above, that is never actually the return value of the function (in that case
vmemmap_populate_basepages() provides the actual return value). The
"return ret" is also misleading, since by that point we know ret == 0 (and
the 'else' is redundant).

Can you not just move the call to vmemmap_populate_basepages() up to just
after the (possible) vmemmap_free() call and remove the 'ret' variable
altogether? (Rough sketch at the end of this mail.)

AFAICT the call to vmemmap_free() also doesn't need the #ifdef, since that
function is a no-op when CONFIG_MEMORY_HOTPLUG isn't set.

I also feel you need at least a comment to address Anshuman's point that it
looks like you're freeing an unmapped area. Although, if I'm reading the code
correctly, the unmapped area will simply be skipped.

Steve

> }
> #endif /* !ARM64_SWAPPER_USES_SECTION_MAPS */
> void vmemmap_free(unsigned long start, unsigned long end,
>
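For illustration, a rough, untested sketch of the restructuring suggested
above, limited to the allocation-failure branch of vmemmap_populate(): drop
'ret' and the #ifdef, and fall back to base pages for the whole range as soon
as a PMD_SIZE allocation fails. Context lines and placement are approximate,
not an exact diff:

            void *p = NULL;

            p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
            if (!p) {
                /*
                 * Could not get a physically contiguous PMD_SIZE block.
                 * Free any section mappings already created for this
                 * range (unmapped parts should simply be skipped, and
                 * vmemmap_free() is a no-op without
                 * CONFIG_MEMORY_HOTPLUG), then fall back to base pages
                 * for the whole range.
                 */
                vmemmap_free(start, end, altmap);
                return vmemmap_populate_basepages(start, end,
                                                  node, altmap);
            }

            pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
        } else
            vmemmap_verify((pte_t *)pmdp, node, addr, next);
    } while (addr = next, addr != end);

    return 0;

With that shape the function either returns 0 after the loop or whatever
vmemmap_populate_basepages() returns, so 'ret' isn't needed at all.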