Subject: Re: [patch] tlb flush_data: replace per_cpu with an array
From: Peter Zijlstra
To: Ingo Molnar
Cc: Ravikiran G Thirumalai, Frederik Deweerdt, andi@firstfloor.org, tglx@linutronix.de, hpa@zytor.com, linux-kernel@vger.kernel.org
In-Reply-To: <20090112230052.GB18771@elte.hu>
References: <20090112213539.GA10720@gambetta> <20090112223421.GA20594@localdomain> <20090112230052.GB18771@elte.hu>
Date: Tue, 13 Jan 2009 13:00:16 +0100
Message-Id: <1231848016.442.112.camel@twins>

On Tue, 2009-01-13 at 00:00 +0100, Ingo Molnar wrote:

> From 23d9dc8bffc759c131b09a48b5215cc2b37a5ac3 Mon Sep 17 00:00:00 2001
> From: Frederik Deweerdt
> Date: Mon, 12 Jan 2009 22:35:42 +0100
> Subject: [PATCH] x86, tlb flush_data: replace per_cpu with an array
>
> Impact: micro-optimization, memory reduction
>
> On x86_64 flush tlb data is stored in per_cpu variables. This is
> unnecessary because only the first NUM_INVALIDATE_TLB_VECTORS entries
> are accessed.
>
> This patch aims at making the code less confusing (there's nothing
> really "per_cpu") by using a plain array. It also would save some memory
> on most distros out there (Ubuntu x86_64 has NR_CPUS=64 by default).
>
> [ Ravikiran G Thirumalai also pointed out that the correct alignment
>   is ____cacheline_internodealigned_in_smp, so that there's no
>   bouncing on vsmp. ]
>
> Signed-off-by: Frederik Deweerdt
> Signed-off-by: Ingo Molnar
> ---
>  arch/x86/kernel/tlb_64.c |   16 ++++++++--------
>  1 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kernel/tlb_64.c b/arch/x86/kernel/tlb_64.c
> index f8be6f1..c5a6c6f 100644
> --- a/arch/x86/kernel/tlb_64.c
> +++ b/arch/x86/kernel/tlb_64.c
> @@ -33,7 +33,7 @@
>   *      To avoid global state use 8 different call vectors.
>   *      Each CPU uses a specific vector to trigger flushes on other
>   *      CPUs. Depending on the received vector the target CPUs look into
> - *      the right per cpu variable for the flush data.
> + *      the right array slot for the flush data.
>   *
>   *      With more than 8 CPUs they are hashed to the 8 available
>   *      vectors. The limited global vector space forces us to this right now.
> @@ -48,13 +48,13 @@ union smp_flush_state {
>                  unsigned long flush_va;
>                  spinlock_t tlbstate_lock;
>          };
> -        char pad[SMP_CACHE_BYTES];
> -} ____cacheline_aligned;
> +        char pad[X86_INTERNODE_CACHE_BYTES];
> +} ____cacheline_internodealigned_in_smp;

That will make the below array 8*4096 bytes for VSMP, which pushes the
limit for memory savings up to 256 cpus.

I'm really dubious this patch is really worth it.

>  /* State is put into the per CPU data section, but padded
>     to a full cache line because other CPUs can access it and we don't
>     want false sharing in the per cpu data segment. */
> -static DEFINE_PER_CPU(union smp_flush_state, flush_state);
> +static union smp_flush_state flush_state[NUM_INVALIDATE_TLB_VECTORS];
>
>  /*
>   * We cannot call mmdrop() because we are in interrupt context,
> @@ -129,7 +129,7 @@ asmlinkage void smp_invalidate_interrupt(struct pt_regs *regs)
>           * Use that to determine where the sender put the data.
>           */
>          sender = ~regs->orig_ax - INVALIDATE_TLB_VECTOR_START;
> -        f = &per_cpu(flush_state, sender);
> +        f = &flush_state[sender];
>
>          if (!cpu_isset(cpu, f->flush_cpumask))
>                  goto out;
> @@ -169,7 +169,7 @@ void native_flush_tlb_others(const cpumask_t *cpumaskp, struct mm_struct *mm,
>
>          /* Caller has disabled preemption */
>          sender = smp_processor_id() % NUM_INVALIDATE_TLB_VECTORS;
> -        f = &per_cpu(flush_state, sender);
> +        f = &flush_state[sender];
>
>          /*
>           * Could avoid this lock when
> @@ -205,8 +205,8 @@ static int __cpuinit init_smp_flush(void)
>  {
>          int i;
>
> -        for_each_possible_cpu(i)
> -                spin_lock_init(&per_cpu(flush_state, i).tlbstate_lock);
> +        for (i = 0; i < ARRAY_SIZE(flush_state); i++)
> +                spin_lock_init(&flush_state[i].tlbstate_lock);
>
>          return 0;
>  }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
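
The 256-CPU figure in the reply above can be reproduced with a little
arithmetic. The sketch below is only an illustration and not kernel code:
the helper names are hypothetical, the 8 vectors and the 4096-byte vSMP
internode padding come from the thread, and the 128-byte SMP_CACHE_BYTES
is an assumption (the reply does not state which cache line size it used).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes used when every possible CPU carries one
 * flush_state padded to smp_cache_bytes (the per_cpu layout). */
size_t percpu_total(size_t nr_cpus, size_t smp_cache_bytes)
{
        return nr_cpus * smp_cache_bytes;
}

/* Hypothetical helper: bytes used by a fixed array with one slot per
 * invalidate vector, each padded to internode_bytes (the array layout). */
size_t array_total(size_t nr_vectors, size_t internode_bytes)
{
        return nr_vectors * internode_bytes;
}

/* Smallest NR_CPUS at which the per_cpu layout is at least as large as
 * the array layout, i.e. where the array stops costing extra memory. */
size_t break_even_cpus(size_t nr_vectors, size_t internode_bytes,
                       size_t smp_cache_bytes)
{
        size_t total;

        assert(smp_cache_bytes > 0);    /* avoid dividing by zero */
        total = array_total(nr_vectors, internode_bytes);
        /* Round up so a partial cache line still counts as one CPU. */
        return (total + smp_cache_bytes - 1) / smp_cache_bytes;
}
```

Under those assumptions, break_even_cpus(8, 4096, 128) gives 256, matching
the reply: the array takes 8 * 4096 = 32768 bytes, and it takes 256 per-CPU
copies of 128 bytes to reach that. If the cache line were assumed to be 64
bytes instead, the same call would give 512.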