Date: Tue, 16 Apr 2024 17:24:54 +0300
From: Mike Rapoport
To: Björn Töpel
Cc: Christian Brauner, Nam Cao, Andreas Dilger, Al Viro, linux-fsdevel, Jan Kara, Linux Kernel Mailing List, linux-riscv@lists.infradead.org, Theodore Ts'o, Ext4
 Developers List, Conor Dooley, "Matthew Wilcox (Oracle)", Anders Roxell, Alexandre Ghiti
Subject: Re: riscv32 EXT4 splat, 6.8 regression?
References: <20240416-deppen-gasleitung-8098fcfd6bbd@brauner> <8734rlo9j7.fsf@all.your.base.are.belong.to.us>
X-Mailing-List: linux-ext4@vger.kernel.org
In-Reply-To: <8734rlo9j7.fsf@all.your.base.are.belong.to.us>

Hi,

On Tue, Apr 16, 2024 at 01:02:20PM +0200, Björn Töpel wrote:
> Christian Brauner writes:
>
> > [Adding Mike who's knowledgeable in this area]
>
> >> > Further, it seems like riscv32 indeed inserts a page like that to the
> >> > buddy allocator, when the memblock is free'd:
> >> >
> >> > | [] __free_one_page+0x2a4/0x3ea
> >> > | [] __free_pages_ok+0x158/0x3cc
> >> > | [] __free_pages_core+0xe8/0x12c
> >> > | [] memblock_free_pages+0x1a/0x22
> >> > | [] memblock_free_all+0x1ee/0x278
> >> > | [] mem_init+0x10/0xa4
> >> > | [] mm_core_init+0x11a/0x2da
> >> > | [] start_kernel+0x3c4/0x6de
> >> >
> >> > Here, a page with VA 0xfffff000 is added to the freelist. We were just
> >> > lucky (unlucky?) that page was used for the page cache.
> >>
> >> I just educated myself about memory mapping last night, so the below
> >> may be complete nonsense. Take it with a grain of salt.
> >>
> >> In riscv's setup_bootmem(), we have this line:
> >>         max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
> >>
> >> I think this is the root cause: max_low_pfn indicates the last page
> >> to be mapped. Problem is: nothing prevents PFN_DOWN(phys_ram_end) from
> >> getting mapped to the last page (0xfffff000). If max_low_pfn is mapped
> >> to the last page, we get the reported problem.
> >>
> >> There seems to be some code to make sure the last page is not used
> >> (the call to memblock_set_current_limit() right above this line).
> >> It is unclear to me why this still lets the problem slip through.
> >>
> >> The fix is simple: never let max_low_pfn get mapped to the last page.
> >> The below patch fixes the problem for me. But I am not entirely sure if
> >> this is the correct fix, further investigation needed.
> >>
> >> Best regards,
> >> Nam
> >>
> >> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> >> index fa34cf55037b..17cab0a52726 100644
> >> --- a/arch/riscv/mm/init.c
> >> +++ b/arch/riscv/mm/init.c
> >> @@ -251,7 +251,8 @@ static void __init setup_bootmem(void)
> >>          }
> >>
> >>          min_low_pfn = PFN_UP(phys_ram_base);
> >> -        max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
> >> +        max_low_pfn = PFN_DOWN(memblock_get_current_limit());
> >> +        max_pfn = PFN_DOWN(phys_ram_end);
> >>          high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
> >>
> >>          dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn));
>
> Yeah, AFAIU memblock_set_current_limit() only limits the allocation from
> memblock. The "forbidden" page (PA 0xc03ff000 VA 0xfffff000) will still
> be allowed in the zone.
>
> I think your patch requires that memblock_set_current_limit() is
> unconditionally called, which currently is not done.
>
> The hack I tried was (which seems to work):
>
> --
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index fe8e159394d8..3a1f25d41794 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -245,8 +245,10 @@ static void __init setup_bootmem(void)
>           */
>          if (!IS_ENABLED(CONFIG_64BIT)) {
>                  max_mapped_addr = __pa(~(ulong)0);
> -                if (max_mapped_addr == (phys_ram_end - 1))
> +                if (max_mapped_addr == (phys_ram_end - 1)) {
>                          memblock_set_current_limit(max_mapped_addr - 4096);
> +                        phys_ram_end -= 4096;
> +                }
>          }

You can just memblock_reserve() the last page of the first gigabyte, e.g.

        if (!IS_ENABLED(CONFIG_64BIT))
                memblock_reserve(SZ_1G - PAGE_SIZE, PAGE_SIZE);

The page will still be mapped, but it will never make it to the page
allocator.
The nice thing about it is that memblock lets you reserve regions that
are not necessarily populated, so there's no need to check where the
actual RAM ends.

>
>          min_low_pfn = PFN_UP(phys_ram_base);
> --
>
> I'd really like to see an actual MM person (Mike or Alex?) have some
> input here, and not simply my pasta-on-wall approach. ;-)
>
>
> Björn

-- 
Sincerely yours,
Mike.
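
To illustrate the reservation idea discussed above, here is a minimal user-space sketch in C. It is a toy model, not kernel code and not the memblock API: `struct range`, `pa_to_va()`, `page_end_va()` and `release_pages()` are hypothetical helpers made up for this illustration. The only values taken from the thread are the PA/VA pair of the "forbidden" page (PA 0xc03ff000, VA 0xfffff000); the RAM layout used in the usage note is an assumption. The sketch shows two things: the one-past-the-end address of that page wraps to 0 in 32-bit arithmetic, and a reserved page stays out of the set of pages handed to the allocator.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    0x1000u
/* Addresses of the "forbidden" page as quoted in the thread. */
#define LAST_PAGE_PA 0xc03ff000u
#define LAST_PAGE_VA 0xfffff000u

/* Toy stand-in for a memory or reserved region (hypothetical, not the kernel struct). */
struct range {
	uint32_t base;
	uint32_t size;
};

/* Linear-map translation using the offset implied by the PA/VA pair above. */
static uint32_t pa_to_va(uint32_t pa)
{
	return pa + (LAST_PAGE_VA - LAST_PAGE_PA);
}

/* One-past-the-end address of a page; wraps to 0 for the last 32-bit page. */
static uint32_t page_end_va(uint32_t va)
{
	return va + PAGE_SIZE;
}

/* Return 1 if physical address pa lies inside region r. */
static int in_range(const struct range *r, uint32_t pa)
{
	return pa >= r->base && pa - r->base < r->size;
}

/*
 * Count the pages of mem that would be released to the page allocator,
 * skipping any page covered by a reserved region. *last_freed reports
 * whether the page mapped at LAST_PAGE_VA was among the released ones.
 */
static unsigned int release_pages(const struct range *mem,
				  const struct range *rsv, unsigned int nr_rsv,
				  int *last_freed)
{
	unsigned int freed = 0;

	assert(mem->base % PAGE_SIZE == 0 && mem->size % PAGE_SIZE == 0);
	*last_freed = 0;
	for (uint32_t off = 0; off < mem->size; off += PAGE_SIZE) {
		uint32_t pa = mem->base + off;
		int reserved = 0;

		for (unsigned int i = 0; i < nr_rsv; i++)
			reserved |= in_range(&rsv[i], pa);
		if (reserved)
			continue;
		if (pa_to_va(pa) == LAST_PAGE_VA)
			*last_freed = 1;
		freed++;
	}
	return freed;
}
```

Assuming a hypothetical 4 MiB of RAM starting at PA 0xc0000000 (so that RAM ends right after the forbidden page), calling `release_pages()` with no reservations releases 1024 pages including the one at VA 0xfffff000. Reserving one page at `LAST_PAGE_PA` drops the count to 1023 and keeps that page out. Reserving a range that lies outside the populated RAM changes nothing in this model, which mirrors the point that there is no need to check where the actual RAM ends.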