Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp2523618pxb; Sun, 31 Oct 2021 18:20:34 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwfEvOuAFTLrmMQG0szkEmsIT1yFAKiQ4GCjwc2BqCMPpBkgAnBpkO88gb4Cgh45mB9pSXb X-Received: by 2002:a05:6402:4244:: with SMTP id g4mr35667090edb.158.1635729634081; Sun, 31 Oct 2021 18:20:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635729634; cv=none; d=google.com; s=arc-20160816; b=bLZEVFiH2zoO6MdVs7nXEKKFB3ke6iiG9SUGNtZudfUP5XIvRMYgawLyw1BzxTc3Fn V07ednXlluI3SD5dGfhpnWoPWdeTfoLItnOWB3HdW4aAiTnlClEq8xdZpmUbkk/JvqTM D+LDfXdbEIhIH4bJlGgkednx0SkqOxbcuEralATA2may0+XBCxcPGLN3t1RHTaWLlg66 +73lHVEBNu3LOl/l+sOqMKyQqCnUqGK0uVpQEmBST6BCwZ2cMGHCIzQuFjmv7y6lMljf x2NTQ4QoDR1csTRECFLu+jOIop8eFi4jnVJcUFSRPc+pj5RoKFoqdC7Txwl8zRX5PHWB +Uwg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:date:mime-version:content-transfer-encoding :message-id:references:subject:cc:to:dkim-signature:dkim-signature :from; bh=2FXTDgcUzpH3THChTljuF3o+vyXS8ekbi4IHYMQwefk=; b=vzzTuQUNeM9lewQJTClDykX+ELPiggB/jbjg1NTdihei4U5WBl/91Vbmeu1OoK4Vp+ tr2XvHhhhtdEwX+1iPHSMma7mESVOMRSqCI2G9Ryj0bQ13eO0BiCVMCfGWH0Jidu9mJc GJkgtfxiNnWwZNFJq4Fie8YvUAQFV60/11hgtShqXiZ5wQnyOPkrpgt0Py/RE04JsO5I gSESjMZMCmygoitKPPytOOzEFRLguNdgSAfMZXlSVf7XDReCmd8v19ixvXLc1eHcwNcN 7J2DJWnIlglaIzazZdmXyYonxsTncKHl4JZak+o0RTZwwe8jrHtrw5HFFPVOBVmeXJ68 uizA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=z1TJMbZh; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=HgdPzSXh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id o26si4760622edi.14.2021.10.31.18.20.06; Sun, 31 Oct 2021 18:20:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=z1TJMbZh; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=HgdPzSXh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231312AbhKABSz (ORCPT + 99 others); Sun, 31 Oct 2021 21:18:55 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:39166 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230484AbhKABSl (ORCPT ); Sun, 31 Oct 2021 21:18:41 -0400 From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1635729368; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: references:references; bh=2FXTDgcUzpH3THChTljuF3o+vyXS8ekbi4IHYMQwefk=; b=z1TJMbZhrcACsiNWxJRxr/meSI7HDUquBDZI5yrQs4UisusZv9QUC3KKY+Js+rIrJ/JkUt R4166KNBF06vbxaMiiVKaqkPqxZltkAGzgyhihZ+/xb+v6G0o75kwcWdawKRmsn+YvvIrY tTZLTbOGVs7qqw2COwDVs2SLeDBnpkv9MOKK55pEysdkjcG5WjCNaTlabU/ZCifgri5kib v4Qn9bo8aW9Qv5Z/a+MFZJyNePPPHBzE38ClAEvmlIKsPPanIfyRFFcsEJtCu99t4jKGSI YWumm9H9r+wnK2WeQclyFts3wWUQZho7nPGxnvguRaYNT9ArrF2Iy4tzBEtrLw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1635729368; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: references:references; bh=2FXTDgcUzpH3THChTljuF3o+vyXS8ekbi4IHYMQwefk=; b=HgdPzSXhZgNWA8eh63tV7De5zNtQKIMJW4C1NkyJOH9W7NjmY5jbeepi3+ZQu0LdY9oarv ZjYqhaf/g/KbHNCQ== To: Linus Torvalds Cc: linux-kernel@vger.kernel.org, x86@kernel.org Subject: [GIT pull] x86/apic for v5.16-rc1 References: <163572864256.3357115.931779940195622047.tglx@xen13> Message-ID: <163572865146.3357115.6271807450024716473.tglx@xen13> Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Date: Mon, 1 Nov 2021 02:16:07 +0100 (CET) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Linus, please pull the latest x86/apic branch from: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-apic-2021-11= -01 up to: cc95a07fef06: x86/apic: Reduce cache line misses in __x2apic_send_IPI= _mask() x86/apic related update: - A single commit which reduces cacheline misses in __x2apic_send_IPI_mask() significantly by converting x86_cpu_to_logical_apicid() to an array instead of using per CPU storage. This reduces the cost for a full broadcast on a dual socket system with 256 CPUs from 33 down to 11 microseconds. Thanks, tglx ------------------> Eric Dumazet (1): x86/apic: Reduce cache line misses in __x2apic_send_IPI_mask() arch/x86/kernel/apic/x2apic_cluster.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/apic/x2apic_cluster.c b/arch/x86/kernel/apic/x2a= pic_cluster.c index f4da9bb69a88..e696e22d0531 100644 --- a/arch/x86/kernel/apic/x2apic_cluster.c +++ b/arch/x86/kernel/apic/x2apic_cluster.c @@ -15,9 +15,15 @@ struct cluster_mask { struct cpumask mask; }; =20 -static DEFINE_PER_CPU(u32, x86_cpu_to_logical_apicid); +/* + * __x2apic_send_IPI_mask() possibly needs to read + * x86_cpu_to_logical_apicid for all online cpus in a sequential way. + * Using per cpu variable would cost one cache line per cpu. + */ +static u32 *x86_cpu_to_logical_apicid __read_mostly; + static DEFINE_PER_CPU(cpumask_var_t, ipi_mask); -static DEFINE_PER_CPU(struct cluster_mask *, cluster_masks); +static DEFINE_PER_CPU_READ_MOSTLY(struct cluster_mask *, cluster_masks); static struct cluster_mask *cluster_hotplug_mask; =20 static int x2apic_acpi_madt_oem_check(char *oem_id, char *oem_table_id) @@ -27,7 +33,7 @@ static int x2apic_acpi_madt_oem_check(char *oem_id, char *o= em_table_id) =20 static void x2apic_send_IPI(int cpu, int vector) { - u32 dest =3D per_cpu(x86_cpu_to_logical_apicid, cpu); + u32 dest =3D x86_cpu_to_logical_apicid[cpu]; =20 /* x2apic MSRs are special and need a special fence: */ weak_wrmsr_fence(); @@ -58,7 +64,7 @@ __x2apic_send_IPI_mask(const struct cpumask *mask, int vect= or, int apic_dest) =20 dest =3D 0; for_each_cpu_and(clustercpu, tmpmsk, &cmsk->mask) - dest |=3D per_cpu(x86_cpu_to_logical_apicid, clustercpu); + dest |=3D x86_cpu_to_logical_apicid[clustercpu]; =20 if (!dest) continue; @@ -94,7 +100,7 @@ static void x2apic_send_IPI_all(int vector) =20 static u32 x2apic_calc_apicid(unsigned int cpu) { - return per_cpu(x86_cpu_to_logical_apicid, cpu); + return x86_cpu_to_logical_apicid[cpu]; } =20 static void init_x2apic_ldr(void) @@ -103,7 +109,7 @@ static void init_x2apic_ldr(void) u32 cluster, apicid =3D apic_read(APIC_LDR); unsigned int cpu; =20 - this_cpu_write(x86_cpu_to_logical_apicid, apicid); + x86_cpu_to_logical_apicid[smp_processor_id()] =3D apicid; =20 if (cmsk) goto update; @@ -166,12 +172,21 @@ static int x2apic_dead_cpu(unsigned int dead_cpu) =20 static int x2apic_cluster_probe(void) { + u32 slots; + if (!x2apic_mode) return 0; =20 + slots =3D max_t(u32, L1_CACHE_BYTES/sizeof(u32), nr_cpu_ids); + x86_cpu_to_logical_apicid =3D kcalloc(slots, sizeof(u32), GFP_KERNEL); + if (!x86_cpu_to_logical_apicid) + return 0; + if (cpuhp_setup_state(CPUHP_X2APIC_PREPARE, "x86/x2apic:prepare", x2apic_prepare_cpu, x2apic_dead_cpu) < 0) { pr_err("Failed to register X2APIC_PREPARE\n"); + kfree(x86_cpu_to_logical_apicid); + x86_cpu_to_logical_apicid =3D NULL; return 0; } init_x2apic_ldr();