From: Alexandre Ghiti
Date: Wed, 17 May 2023 16:55:53 +0200
Subject: Re: Bug report: kernel paniced when system hibernates
To: Conor Dooley
Cc: Song Shuai, robh@kernel.org, Andrew Jones, anup@brainfault.org, palmer@rivosinc.com, jeeheng.sia@starfivetech.com, leyfoon.tan@starfivetech.com, mason.huo@starfivetech.com, Paul Walmsley, Guo Ren, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230517-preacher-primer-f41020b3376a@wendy>
References: <20230517-preacher-primer-f41020b3376a@wendy>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 17, 2023 at 1:28 PM Conor Dooley wrote:
>
> Hey Alex,
>
> On Wed, May 17, 2023 at 10:58:02AM +0200, Alexandre Ghiti wrote:
> > On Tue, May 16, 2023 at 1:12 PM Alexandre Ghiti wrote:
> > > > On Tue, May 16, 2023 at 11:24 AM Song Shuai wrote:
> > > I actually removed this flag a few years ago, and I have to admit that
> > > I need to check if that's necessary: the goal of commit 3335068f8721
> > > ("riscv: Use PUD/P4D/PGD pages for the linear mapping") is to expose
> > > the "right" start of DRAM so that we can align virtual and physical
> > > addresses on a 1GB boundary.
> > >
> > > So I have to check if a nomap region is actually added as a
> > > memblock.memory.regions[] or not: if yes, that's perfect, let's add
> > > the nomap attributes to the PMP regions, otherwise, I don't think that
> > > is a good solution.
> >
> > So here is the current linear mapping without nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000000000-0xff60000000200000 0x0000000080000000 2M
> > PMD D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000 0x0000000080200000 12M
> > PMD D A G . . . R V
> >
> > And below the linear mapping with nomap in openSBI:
> >
> > ---[ Linear mapping ]---
> > 0xff60000000080000-0xff60000000200000 0x0000000080080000 1536K
> > PTE D A G . . W R V
> > 0xff60000000200000-0xff60000000e00000 0x0000000080200000 12M
> > PMD D A G . . . R V
> >
> > So adding nomap does not misalign virtual and physical addresses, it
> > prevents the usage of 1GB page for this area though, so that's a
> > solution, we just lose this 1GB page here.
> >
> > But even though that may be the fix, I think we also need to fix that
> > in the kernel as it would break compatibility with certain versions of
> > openSBI *if* we fix openSBI...So here are a few solutions:
> >
> > 1. we can mark all "mmode_resv" nodes in the device tree as nomap,
> > before the linear mapping is established (IIUC, those nodes are added
> > by openSBI to advertise PMP regions)
> > -> This amounts to the same fix as opensbi and we lose the 1GB hugepage.
>
> AFAIU, losing the 1 GB hugepage is a regression, which would make this
> not an option, right?

Not sure this is a real regression, I'd rather avoid it, but as
mentioned in my first answer, Mike Rapoport showed that it was making
no difference performance-wise...

>
> > 2. we can tweak pfn_is_nosave function to *not* save pfn corresponding
> > to PMP regions
> > -> We don't lose the 1GB hugepage \o/
> > 3. we can use register_nosave_region() to not save the "mmode_resv"
> > regions (x86 does that
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/arch/x86/kernel/e820.c#L753)
> > -> We don't lose the 1GB hugepage \o/
> > 4. Given JeeHeng pointer to
> > https://elixir.bootlin.com/linux/v6.4-rc1/source/kernel/power/snapshot.c#L1340,
> > we can mark those pages as non-readable and make the hibernation
> > process not save those pages
> > -> Very late-in-the-day idea, not sure what it's worth, we also
> > lose the 1GB hugepage...
>
> Ditto here re: introducing another regression.
>
> > To me, the best solution is 3 as it would prepare for other similar
> > issues later, it is similar to x86 and it allows us to keep 1GB
> > hugepages.
> >
> > I have been thinking, and to me nomap does not provide anything since
> > the kernel should not address this memory range, so if it does, we
> > must fix the kernel.
> >
> > Let me know what you all think, I'll be preparing a PoC of 3 in the meantime!
>
> #3 would probably get my vote too. It seems like you could use it
> dynamically if there was to be a future other provider of "mmode_resv"
> regions, rather than doing something location-specific.
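
A minimal sketch of the idea behind option 3, written as a userspace Python model rather than kernel C. It only illustrates the bookkeeping: firmware-reserved ranges are recorded once, and the snapshot pass skips any page frame that falls inside them while the linear mapping itself is left untouched. `NosaveRegistry`, `phys_to_pfn_range` and `pages_to_save` are hypothetical names, the half-open PFN range convention is an assumption, and the 0x80000000-0x80080000 firmware range is only inferred from the gap between the two linear-mapping dumps quoted above.

```python
from dataclasses import dataclass, field

PAGE_SHIFT = 12  # assumes 4 KiB base pages


@dataclass
class NosaveRegistry:
    """Toy model of a hibernation "nosave" list (hypothetical, not kernel code)."""

    # Half-open (start_pfn, end_pfn) pairs; the convention is an assumption.
    regions: list = field(default_factory=list)

    def register(self, start_pfn, end_pfn):
        """Record [start_pfn, end_pfn) as a range the snapshot must skip."""
        if start_pfn >= end_pfn:
            return  # empty or inverted range: nothing to record
        self.regions.append((start_pfn, end_pfn))

    def is_nosave(self, pfn):
        """True if pfn lies inside any registered range."""
        return any(start <= pfn < end for start, end in self.regions)


def phys_to_pfn_range(base, size):
    """Convert a physical byte range to a PFN range covering every touched page."""
    start_pfn = base >> PAGE_SHIFT
    end_pfn = (base + size + (1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT
    return start_pfn, end_pfn


def pages_to_save(first_pfn, end_pfn, registry):
    """List the PFNs in [first_pfn, end_pfn) that the snapshot would copy."""
    return [pfn for pfn in range(first_pfn, end_pfn)
            if not registry.is_nosave(pfn)]


# Assumed layout: 512 KiB of firmware at the start of 14 MiB of DRAM.
registry = NosaveRegistry()
registry.register(*phys_to_pfn_range(0x80000000, 0x80000))
saved = pages_to_save(0x80000, 0x80E00, registry)
print(len(saved), "of", 0x80E00 - 0x80000, "pages would be saved")
```

In this model the reserved pages stay mapped, which is why the approach does not cost the 1GB hugepage: only the set of pages copied into the image changes.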
>
> We should probably document these opensbi reserved memory nodes though
> in a dt-binding or w/e if we are going to be relying on them to not
> crash!

Yes, you're right, let's see what Atish and Anup think!

Thanks for your quick answers Conor and Song, really appreciated!

Alex

>
> Thanks for working on this,
> Conor.
>
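
The page-size arithmetic behind the two linear-mapping dumps quoted in this message can be checked with a small model. This is a simplified greedy mapper in Python, assuming that a page size is usable only when both the virtual and the physical address are aligned to it and enough range remains; it is not the kernel's mapping code, and `largest_page` and `map_range` are hypothetical names.

```python
# Page sizes from largest to smallest: PUD (1G), PMD (2M), PTE (4K).
SIZES = [("1G", 1 << 30), ("2M", 1 << 21), ("4K", 1 << 12)]

VA_BASE = 0xFF60000000000000  # start of the linear mapping in the dumps
PA_BASE = 0x80000000          # start of DRAM in the dumps


def largest_page(va, pa, remaining):
    """Pick the biggest page size that va, pa and the remaining length allow."""
    for name, size in SIZES:
        if va % size == 0 and pa % size == 0 and remaining >= size:
            return name, size
    raise ValueError("addresses and length must be at least 4K aligned")


def map_range(va, pa, length):
    """Greedily map [pa, pa + length) and return (page_size, count) runs."""
    runs = []
    while length > 0:
        name, size = largest_page(va, pa, length)
        if runs and runs[-1][0] == name:
            runs[-1][1] += 1  # extend the current run of same-sized pages
        else:
            runs.append([name, 1])
        va += size
        pa += size
        length -= size
    return [(name, count) for name, count in runs]


# Without nomap: 14M starting at the 1GB-aligned DRAM base.
print(map_range(VA_BASE, PA_BASE, 0xE00000))
# With nomap: the first 512K (0x80000) is left out of the mapping.
print(map_range(VA_BASE + 0x80000, PA_BASE + 0x80000, 0xE00000 - 0x80000))
```

Under these assumptions the second call yields 384 PTE-sized pages (1536K) followed by six 2M pages, matching the dump taken with nomap. The virtual-to-physical offset is the same in both calls, so the addresses stay mutually aligned; what is lost is only the ability to cover the first gigabyte with a single 1G page.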