Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752113AbdCTCUE (ORCPT ); Sun, 19 Mar 2017 22:20:04 -0400
Received: from mx1.redhat.com ([209.132.183.28]:45052 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751421AbdCTCUD (ORCPT ); Sun, 19 Mar 2017 22:20:03 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com E485637E64
Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=xpang@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com E485637E64
Reply-To: xlpang@redhat.com
Subject: Re: [PATCH] x86_64, kexec: Avoid unnecessary identity mappings for kdump
References: <1489746150-28364-1-git-send-email-xlpang@redhat.com> <8737ebg5w6.fsf@xmission.com>
To: "Eric W. Biederman", Xunlei Pang
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, akpm@linux-foundation.org, Dave Young, Baoquan He, "H. Peter Anvin", Ingo Molnar, x86@kernel.org
From: Xunlei Pang
Message-ID: <58CF3CF4.4060804@redhat.com>
Date: Mon, 20 Mar 2017 10:22:44 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <8737ebg5w6.fsf@xmission.com>
Content-Type: text/plain; charset=gbk
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 20 Mar 2017 02:20:03 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3817
Lines: 100

On 03/18/2017 at 01:38 AM, Eric W. Biederman wrote:
> Xunlei Pang writes:
>
>> kexec setups identity mappings for all the memory mapped in 1st kernel,
>> this is not necessary for the kdump case.
>> Actually it can cause extra
>> memory consumption for paging structures, which is quite considerable
>> on modern machines with huge memory.
>>
>> E.g. On our 24TB machine, it will waste around 96MB (around 4MB/TB)
>> from the reserved memory range if setting all the identity mappings.
>>
>> It also causes some trouble for distributions that use an intelligent
>> policy to evaluate the proper "crashkernel=X" for users.
>>
>> To solve it, in case of kdump, we only setup identity mappings for the
>> crash memory and the ISA memory(may be needed by purgatory/kdump
>> boot).
> How about instead we detect the presence of 1GiB pages and use them
> if they are available. We already use 2MiB pages. If we can do that
> we will only need about 192K for page tables in the case you have
> described and this all becomes a non-issue.
>
> I strongly suspect that the presence of 24TiB of memory in an x86 system
> strongly correlates to the presence of 1GiB pages.
>
> In principle we certainly can use a less extensive mapping but that
> should not be something that differs between the two kexec cases.

Ok, will try gbpages for the identity mapping.

Regards,
Xunlei

> I can see forcing the low 1MiB range in. But calling it ISA range is
> very wrong and misleading. The reasons that range are special during
> boot-up have nothing to do with ISA. But have everything to do with
> where legacy page tables are mapped, and where we need identity pages to
> start other cpus. I think the only user that actually cares is
> purgatory where it plays swapping games with the low 1MiB because we
> can't preload what we need to down there or it would mess up the running
> kernel. So saying anything about the old ISA bus is wrong and
> misleading. At the very very least we need accurate comments.
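
For anyone following the numbers quoted earlier in this thread (around
96MB of page tables with 2MiB pages, about 192K with 1GiB pages), here is
a rough back-of-the-envelope sketch. It is not kernel code. It assumes
x86-64 4-level paging with 4KiB page-table pages of 512 entries each, one
top-level table, and a single contiguous 24TiB range starting at 0. The
helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define TABLE_SIZE        4096ULL       /* bytes per page-table page (assumed) */
#define ENTRIES_PER_TABLE 512ULL        /* entries per page-table page */
#define GIB               (1ULL << 30)
#define TIB               (1ULL << 40)

/* Hypothetical helper: integer division rounding up. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* Each PUD table has 512 entries and each entry covers 1GiB. */
static uint64_t pud_table_bytes(uint64_t mem_bytes)
{
	return div_round_up(mem_bytes, ENTRIES_PER_TABLE * GIB) * TABLE_SIZE;
}

/* With 2MiB pages, one PMD table (512 x 2MiB) is needed per 1GiB mapped. */
static uint64_t pmd_table_bytes(uint64_t mem_bytes)
{
	return div_round_up(mem_bytes, GIB) * TABLE_SIZE;
}

/* Total identity-mapping page tables with 2MiB pages: PGD + PUDs + PMDs. */
static uint64_t ident_map_bytes_2m(uint64_t mem_bytes)
{
	return TABLE_SIZE + pud_table_bytes(mem_bytes) + pmd_table_bytes(mem_bytes);
}

/* Total with 1GiB pages: PGD + PUDs only, no PMD level is needed. */
static uint64_t ident_map_bytes_1g(uint64_t mem_bytes)
{
	return TABLE_SIZE + pud_table_bytes(mem_bytes);
}
```

Under these assumptions, ident_map_bytes_2m(24 * TIB) is dominated by the
PMD tables (24576 tables of 4KiB, i.e. 96MiB, or 4MiB per TiB), while
ident_map_bytes_1g(24 * TIB) needs only 48 PUD tables (192KiB) plus the
top-level table.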
>
> Eric
>
>
>> Signed-off-by: Xunlei Pang
>> ---
>>  arch/x86/kernel/machine_kexec_64.c | 34 ++++++++++++++++++++++++++++++----
>>  1 file changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 857cdbd..db77a76 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -112,14 +112,40 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
>>
>>  	level4p = (pgd_t *)__va(start_pgtable);
>>  	clear_page(level4p);
>> -	for (i = 0; i < nr_pfn_mapped; i++) {
>> -		mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> -		mend = pfn_mapped[i].end << PAGE_SHIFT;
>>
>> +	if (image->type == KEXEC_TYPE_CRASH) {
>> +		/* Always map the ISA range */
>>  		result = kernel_ident_mapping_init(&info,
>> -				level4p, mstart, mend);
>> +				level4p, 0, ISA_END_ADDRESS);
>>  		if (result)
>>  			return result;
>> +
>> +		/* crashk_low_res may not be initialized when reaching here */
>> +		if (crashk_low_res.end) {
>> +			mstart = crashk_low_res.start;
>> +			mend = crashk_low_res.end + 1;
>> +			result = kernel_ident_mapping_init(&info,
>> +					level4p, mstart, mend);
>> +			if (result)
>> +				return result;
>> +		}
>> +
>> +		mstart = crashk_res.start;
>> +		mend = crashk_res.end + 1;
>> +		result = kernel_ident_mapping_init(&info,
>> +				level4p, mstart, mend);
>> +		if (result)
>> +			return result;
>> +	} else {
>> +		for (i = 0; i < nr_pfn_mapped; i++) {
>> +			mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> +			mend = pfn_mapped[i].end << PAGE_SHIFT;
>> +
>> +			result = kernel_ident_mapping_init(&info,
>> +					level4p, mstart, mend);
>> +			if (result)
>> +				return result;
>> +		}
>>  	}
>>
>>  	/*