Date: Thu, 30 Aug 2018 17:27:39 +0300
From: "Kirill A. Shutemov"
To: Baoquan He
Cc: "Kirill A.
 Shutemov", tglx@linutronix.de, mingo@kernel.org, hpa@zytor.com,
 x86@kernel.org, linux-kernel@vger.kernel.org, kexec@lists.infradead.org
Subject: Re: [PATCH 0/3] Add restrictions for kexec/kdump jumping between 5-level and 4-level kernel
Message-ID: <20180830142739.gfpa23nvex7xbkkf@black.fi.intel.com>
References: <20180829141624.13985-1-bhe@redhat.com>
 <20180830135855.rylamc7mx2ur3tab@kshutemo-mobl1>
 <20180830141202.GA14702@192.168.1.2>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180830141202.GA14702@192.168.1.2>
User-Agent: NeoMutt/20170714-126-deb55f (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 30, 2018 at 02:12:02PM +0000, Baoquan He wrote:
> On 08/30/18 at 04:58pm, Kirill A. Shutemov wrote:
> > On Wed, Aug 29, 2018 at 10:16:21PM +0800, Baoquan He wrote:
> > > This was suggested by Kirill several months ago, I worked out several
> > > patches to fix, then interrupted by other issues. So sort them out
> > > now and post for reviewing.
> >
> > Thanks for doing this.
> >
> > > The current upstream kernel supports 5-level paging mode and supports
> > > dynamically choosing paging mode during bootup according to kernel
> > > image, hardware and kernel parameter setting. This flexibility brings
> > > several issues for kexec/kdump:
> > > 1)
> > > Switching between paging modes, requires changes into target kernel.
> > > It means you cannot kexec() 4-level paging kernel from 5-level paging
> > > kernel if 4-level paging kernel doesn't include changes.
> > >
> > > 2)
> > > Switching from 5-level paging to 4-level paging kernel would fail, if
> > > kexec() put kernel image above 64TiB of memory.
> >
> > I'm not entirely sure that 64TiB is the limit here. Technically, 4-level
> > paging allows to address 256TiB in 1-to-1 mapping.
> > We just don't have machines with that wide physical address space (which
> > don't support 5-level paging too).
>
> Hmm, afaik, the MAX_PHYSMEM_BITS limits the maximum address space
> which physical RAM can mapped to. We have 256TB for the whole address
> space for 4-level paging, that includes user space and kernel space,
> it might not allow 256TB entirely for the direct mapping.
> And the direct mapping is only for physical RAM mapping, and
> kexec/kdump only cares about the physical RAM space and load them
> inside.
>
> # define MAX_PHYSMEM_BITS (pgtable_l5_enabled() ? 52 : 46)
>
> Not sure if my understanding is right, please correct me if I am wrong.

IIRC, we only care about where kexec puts the kernel before it gets
decompressed. After decompression, the kernel will be put into the right
spot.

Decompression is done in early boot, where we use a 1-to-1 mapping (not the
usual kernel virtual memory layout). All 256TiB should be reachable.

That said, I think it's safer to stick with 64TiB.

For the whole patchset:

Acked-by: Kirill A. Shutemov

-- 
 Kirill A. Shutemov
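
For reference, the sizes mentioned in this thread follow directly from the bit widths. Below is a minimal sketch of that arithmetic, assuming the MAX_PHYSMEM_BITS values quoted above (46 and 52) and the usual 48-bit address width of 4-level paging. The helper names tib_from_bits() and below_limit() are made up for illustration; they are not kernel functions and not part of the patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): size, in TiB, of an address
 * space that is 'bits' wide. 1 TiB is 2^40 bytes, so the result is
 * 2^(bits - 40).
 */
static uint64_t tib_from_bits(unsigned int bits)
{
	assert(bits >= 40 && bits < 64);
	return (uint64_t)1 << (bits - 40);
}

/*
 * Hypothetical check (an assumption, not the code from the patches):
 * does a load address lie below the limit implied by 'bits' address bits?
 */
static bool below_limit(uint64_t addr, unsigned int bits)
{
	assert(bits < 64);
	return addr < ((uint64_t)1 << bits);
}
```

With these, tib_from_bits(46) gives the 64TiB limit discussed above, tib_from_bits(48) gives the 256TiB reachable through a 4-level 1-to-1 mapping, and tib_from_bits(52) gives 4096TiB for the 5-level case. An image placed at or above 1 << 46 fails below_limit(addr, 46) but still passes below_limit(addr, 48), which is the gap between the conservative and the technically reachable limit.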