From: Juergen Gross
To: linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Juergen Gross, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, "H. Peter Anvin"
Subject: [PATCH v4 15/16] x86: do MTRR/PAT setup on all secondary CPUs in parallel
Date: Tue, 4 Oct 2022 10:10:22 +0200
Message-Id: <20221004081023.32402-16-jgross@suse.com>
X-Mailer: git-send-email 2.35.3
In-Reply-To: <20221004081023.32402-1-jgross@suse.com>
References: <20221004081023.32402-1-jgross@suse.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Instead of serializing MTRR/PAT setup on the secondary CPUs in order to
avoid clobbering of static variables used by the setup process, put
those variables into a structure held on the stack and drop the
serialization.

This speeds up the start of secondary cpus a little bit (on a small
system with 8 CPUs the time needed for starting the secondary CPUs was
measured to go down from about 60 milliseconds without this patch to
about 55 milliseconds with this patch applied).

Signed-off-by: Juergen Gross
---
V4:
- new patch
---
 arch/x86/include/asm/cacheinfo.h   | 10 ++++++--
 arch/x86/include/asm/mtrr.h        | 13 +++++-----
 arch/x86/kernel/cpu/cacheinfo.c    | 28 ++++++++-------------
 arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++----------------
 4 files changed, 45 insertions(+), 46 deletions(-)

diff --git a/arch/x86/include/asm/cacheinfo.h b/arch/x86/include/asm/cacheinfo.h
index f6c521687535..77ddc04c4975 100644
--- a/arch/x86/include/asm/cacheinfo.h
+++ b/arch/x86/include/asm/cacheinfo.h
@@ -10,8 +10,14 @@ extern unsigned int memory_caching_control;
 void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu);
 void cacheinfo_hygon_init_llc_id(struct cpuinfo_x86 *c, int cpu);

-void cache_disable(void);
-void cache_enable(void);
+struct cache_state {
+	unsigned long cr4;
+	u32 mtrr_deftype_lo;
+	u32 mtrr_deftype_hi;
+};
+
+void cache_disable(struct cache_state *state);
+void cache_enable(struct cache_state *state);
 void set_cache_aps_delayed_init(void);
 void cache_bp_init(void);
 void cache_bp_restore(void);
diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index ec73d1e5bafb..67c16c813259 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -23,6 +23,7 @@
 #ifndef _ASM_X86_MTRR_H
 #define _ASM_X86_MTRR_H

+#include
 #include

 void mtrr_bp_init(void);
@@ -45,9 +46,9 @@ extern void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi);
 extern void mtrr_bp_restore(void);
 extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
-void mtrr_disable(void);
-void mtrr_enable(void);
-void mtrr_generic_set_state(void);
+void mtrr_disable(struct cache_state *state);
+void mtrr_enable(struct cache_state *state);
+void mtrr_generic_set_state(struct cache_state *state);
 # else
 static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
@@ -84,9 +85,9 @@ static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
 {
 }
 #define mtrr_bp_restore() do {} while (0)
-#define mtrr_disable() do {} while (0)
-#define mtrr_enable() do {} while (0)
-#define mtrr_generic_set_state() do {} while (0)
+#define mtrr_disable(s) do {} while (0)
+#define mtrr_enable(s) do {} while (0)
+#define mtrr_generic_set_state(s) do {} while (0)
 # endif

 #ifdef CONFIG_COMPAT
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 48ce48827f87..84684b50a5ce 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -1057,10 +1057,7 @@ int populate_cache_leaves(unsigned int cpu)
  * The caller must ensure that local interrupts are disabled and
  * are reenabled after cache_enable() has been called.
  */
-static unsigned long saved_cr4;
-static DEFINE_RAW_SPINLOCK(cache_disable_lock);
-
-void cache_disable(void) __acquires(cache_disable_lock)
+void cache_disable(struct cache_state *state)
 {
 	unsigned long cr0;

@@ -1071,8 +1068,6 @@ void cache_disable(void) __acquires(cache_disable_lock)
 	 * changes to the way the kernel boots
 	 */

-	raw_spin_lock(&cache_disable_lock);
-
 	/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
 	cr0 = read_cr0() | X86_CR0_CD;
 	write_cr0(cr0);
@@ -1088,8 +1083,8 @@ void cache_disable(void) __acquires(cache_disable_lock)

 	/* Save value of CR4 and clear Page Global Enable (bit 7) */
 	if (boot_cpu_has(X86_FEATURE_PGE)) {
-		saved_cr4 = __read_cr4();
-		__write_cr4(saved_cr4 & ~X86_CR4_PGE);
+		state->cr4 = __read_cr4();
+		__write_cr4(state->cr4 & ~X86_CR4_PGE);
 	}

 	/* Flush all TLBs via a mov %cr3, %reg; mov %reg, %cr3 */
@@ -1097,46 +1092,45 @@ void cache_disable(void) __acquires(cache_disable_lock)
 	flush_tlb_local();

 	if (boot_cpu_has(X86_FEATURE_MTRR))
-		mtrr_disable();
+		mtrr_disable(state);

 	/* Again, only flush caches if we have to. */
 	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
 		wbinvd();
 }

-void cache_enable(void) __releases(cache_disable_lock)
+void cache_enable(struct cache_state *state)
 {
 	/* Flush TLBs (no need to flush caches - they are disabled) */
 	count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL);
 	flush_tlb_local();

 	if (boot_cpu_has(X86_FEATURE_MTRR))
-		mtrr_enable();
+		mtrr_enable(state);

 	/* Enable caches */
 	write_cr0(read_cr0() & ~X86_CR0_CD);

 	/* Restore value of CR4 */
 	if (boot_cpu_has(X86_FEATURE_PGE))
-		__write_cr4(saved_cr4);
-
-	raw_spin_unlock(&cache_disable_lock);
+		__write_cr4(state->cr4);
 }

 static void cache_cpu_init(void)
 {
 	unsigned long flags;
+	struct cache_state state;

 	local_irq_save(flags);
-	cache_disable();
+	cache_disable(&state);

 	if (memory_caching_control & CACHE_MTRR)
-		mtrr_generic_set_state();
+		mtrr_generic_set_state(&state);

 	if (memory_caching_control & CACHE_PAT)
 		pat_cpu_init();

-	cache_enable();
+	cache_enable(&state);
 	local_irq_restore(flags);
 }

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 2f2485d6657f..cddb440f330d 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -663,18 +663,13 @@ static bool set_mtrr_var_ranges(unsigned int index, struct mtrr_var_range *vr)
 	return changed;
 }

-static u32 deftype_lo, deftype_hi;
-
 /**
  * set_mtrr_state - Set the MTRR state for this CPU.
  *
- * NOTE: The CPU must already be in a safe state for MTRR changes, including
- *	 measures that only a single CPU can be active in set_mtrr_state() in
- *	 order to not be subject to races for usage of deftype_lo (this is
- *	 accomplished by taking cache_disable_lock).
+ * NOTE: The CPU must already be in a safe state for MTRR changes.
  * RETURNS: 0 if no changes made, else a mask indicating what was changed.
  */
-static unsigned long set_mtrr_state(void)
+static unsigned long set_mtrr_state(struct cache_state *state)
 {
 	unsigned long change_mask = 0;
 	unsigned int i;
@@ -691,38 +686,40 @@ static unsigned long set_mtrr_state(void)
 	 * Set_mtrr_restore restores the old value of MTRRdefType,
 	 * so to set it we fiddle with the saved value:
 	 */
-	if ((deftype_lo & 0xff) != mtrr_state.def_type
-	    || ((deftype_lo & 0xc00) >> 10) != mtrr_state.enabled) {
-
-		deftype_lo = (deftype_lo & ~0xcff) | mtrr_state.def_type |
-			     (mtrr_state.enabled << 10);
+	if ((state->mtrr_deftype_lo & 0xff) != mtrr_state.def_type
+	    || ((state->mtrr_deftype_lo & 0xc00) >> 10) != mtrr_state.enabled) {
+		state->mtrr_deftype_lo = (state->mtrr_deftype_lo & ~0xcff) |
+					 mtrr_state.def_type |
+					 (mtrr_state.enabled << 10);
 		change_mask |= MTRR_CHANGE_MASK_DEFTYPE;
 	}

 	return change_mask;
 }

-void mtrr_disable(void)
+void mtrr_disable(struct cache_state *state)
 {
 	/* Save MTRR state */
-	rdmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
+	rdmsr(MSR_MTRRdefType, state->mtrr_deftype_lo, state->mtrr_deftype_hi);

 	/* Disable MTRRs, and set the default type to uncached */
-	mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & ~0xcff, deftype_hi);
+	mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo & ~0xcff,
+		   state->mtrr_deftype_hi);
 }

-void mtrr_enable(void)
+void mtrr_enable(struct cache_state *state)
 {
 	/* Intel (P6) standard MTRRs */
-	mtrr_wrmsr(MSR_MTRRdefType, deftype_lo, deftype_hi);
+	mtrr_wrmsr(MSR_MTRRdefType, state->mtrr_deftype_lo,
+		   state->mtrr_deftype_hi);
 }

-void mtrr_generic_set_state(void)
+void mtrr_generic_set_state(struct cache_state *state)
 {
 	unsigned long mask, count;

 	/* Actually set the state */
-	mask = set_mtrr_state();
+	mask = set_mtrr_state(state);

 	/* Use the atomic bitops to update the global mask */
 	for (count = 0; count < sizeof(mask) * 8; ++count) {
@@ -747,11 +744,12 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
 {
 	unsigned long flags;
 	struct mtrr_var_range *vr;
+	struct cache_state state;

 	vr = &mtrr_state.var_ranges[reg];

 	local_irq_save(flags);
-	cache_disable();
+	cache_disable(&state);

 	if (size == 0) {
 		/*
@@ -770,7 +768,7 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
 		mtrr_wrmsr(MTRRphysMask_MSR(reg), vr->mask_lo, vr->mask_hi);
 	}

-	cache_enable();
+	cache_enable(&state);
 	local_irq_restore(flags);
 }

-- 
2.35.3
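
For illustration only, the reason the lock can go away can be modelled in plain user-space C. The sketch below is not kernel code and is not part of the patch: struct fake_cpu, struct demo_state, DEMO_PGE and all function names are hypothetical stand-ins (DEMO_PGE for X86_CR4_PGE, demo_state for struct cache_state). It interleaves a disable/enable pair for two simulated CPUs, which is the ordering the old raw spinlock existed to prevent, once with a single shared save slot and once with caller-owned state.

```c
#include <assert.h>

#define DEMO_PGE 0x80UL		/* hypothetical stand-in for X86_CR4_PGE */

/* Hypothetical model of one CPU: a single simulated CR4 register. */
struct fake_cpu {
	unsigned long cr4;
};

/* Old scheme: one save slot shared by every CPU (like the static saved_cr4). */
static unsigned long shared_saved_cr4;

static void disable_shared(struct fake_cpu *cpu)
{
	shared_saved_cr4 = cpu->cr4;
	cpu->cr4 = shared_saved_cr4 & ~DEMO_PGE;
}

static void enable_shared(struct fake_cpu *cpu)
{
	cpu->cr4 = shared_saved_cr4;
}

/* New scheme: the caller owns the saved state (like struct cache_state). */
struct demo_state {
	unsigned long cr4;
};

static void disable_local(struct fake_cpu *cpu, struct demo_state *state)
{
	state->cr4 = cpu->cr4;
	cpu->cr4 = state->cr4 & ~DEMO_PGE;
}

static void enable_local(struct fake_cpu *cpu, const struct demo_state *state)
{
	cpu->cr4 = state->cr4;
}

/* Returns 1 if both CPUs get their original CR4 back, 0 otherwise. */
int interleave_shared(void)
{
	struct fake_cpu a = { DEMO_PGE | 0x1 }, b = { DEMO_PGE | 0x2 };

	disable_shared(&a);
	disable_shared(&b);	/* overwrites the value saved for CPU a */
	enable_shared(&a);	/* CPU a is restored with CPU b's value */
	enable_shared(&b);

	return a.cr4 == (DEMO_PGE | 0x1) && b.cr4 == (DEMO_PGE | 0x2);
}

/* Same interleaving, but each CPU keeps its saved value on its own stack. */
int interleave_local(void)
{
	struct fake_cpu a = { DEMO_PGE | 0x1 }, b = { DEMO_PGE | 0x2 };
	struct demo_state sa, sb;

	disable_local(&a, &sa);
	disable_local(&b, &sb);
	enable_local(&a, &sa);
	enable_local(&b, &sb);

	return a.cr4 == (DEMO_PGE | 0x1) && b.cr4 == (DEMO_PGE | 0x2);
}
```

With the shared slot the interleaved sequence restores the wrong value on the first CPU, so interleave_shared() returns 0. With caller-owned state, interleave_local() returns 1 without any locking. This models only the saved-value clobbering described in the commit message, not the actual cache, TLB or MSR handling.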