Date: Mon, 26 Jun 2017 17:58:29 +0200
From: Borislav Petkov
To: Andy Lutomirski
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Linus Torvalds,
    Andrew Morton, Mel Gorman, "linux-mm@kvack.org", Nadav Amit,
    Rik van Riel, Dave Hansen, Arjan van de Ven, Peter Zijlstra
Subject: Re: [PATCH v3 11/11] x86/mm: Try to preserve old TLB entries using PCID
Message-ID: <20170626155829.4t2axppz7gwf7trd@pd.tnic>

On Tue, Jun 20, 2017 at 10:22:17PM -0700, Andy Lutomirski wrote:
> PCID is a "process context ID" -- it's what other architectures call
> an address space ID.  Every non-global TLB entry is tagged with a
> PCID, only TLB entries that match the currently selected PCID are
> used, and we can switch PGDs without flushing the TLB.  x86's
> PCID is 12 bits.
>
> This is an unorthodox approach to using PCID.  x86's PCID is far too
> short to uniquely identify a process, and we can't even really
> uniquely identify a running process because there are monster
> systems with over 4096 CPUs.  To make matters worse, past attempts
> to use all 12 PCID bits have resulted in slowdowns instead of
> speedups.
>
> This patch uses PCID differently.  We use a PCID to identify a
> recently-used mm on a per-cpu basis.  An mm has no fixed PCID
> binding at all; instead, we give it a fresh PCID each time it's
> loaded except in cases where we want to preserve the TLB, in which
> case we reuse a recent value.
>
> This seems to save about 100ns on context switches between mms.

"... with my microbenchmark of ping-ponging." :)

>
> Signed-off-by: Andy Lutomirski
> ---
>  arch/x86/include/asm/mmu_context.h     |  3 ++
>  arch/x86/include/asm/processor-flags.h |  2 +
>  arch/x86/include/asm/tlbflush.h        | 18 +++++++-
>  arch/x86/mm/init.c                     |  1 +
>  arch/x86/mm/tlb.c                      | 82 ++++++++++++++++++++++++++--------
>  5 files changed, 86 insertions(+), 20 deletions(-)

...

> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 57b305e13c4c..a9a5aa6f45f7 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -82,6 +82,12 @@ static inline u64 bump_mm_tlb_gen(struct mm_struct *mm)
>  #define __flush_tlb_single(addr) __native_flush_tlb_single(addr)
>  #endif
>
> +/*
> + * 6 because 6 should be plenty and struct tlb_state will fit in
> + * two cache lines.
> + */
> +#define NR_DYNAMIC_ASIDS 6

TLB_NR_DYN_ASIDS

Properly prefixed, I guess.

The rest later, when you're done experimenting. :)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
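
For readers following the thread, here is a rough user-space sketch of the
scheme the quoted commit message describes: each CPU keeps a handful of
slots, an mm that still owns a slot on this CPU reuses it (and can keep its
TLB entries if nothing changed), and any other mm is handed a fresh slot
plus a flush. This is not the code from the patch. Every sketch_* name, the
ctx_id field, and the round-robin replacement policy are assumptions made
for illustration; only the slot count of 6 and the notion of a per-mm TLB
generation come from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as NR_DYNAMIC_ASIDS in the quoted hunk; the name is hypothetical. */
#define SKETCH_NR_DYN_ASIDS 6

/* One per-CPU slot: which mm used it last and how fresh its TLB entries are. */
struct sketch_slot {
	uint64_t ctx_id;  /* hypothetical unique id of the mm; 0 means "unused" */
	uint64_t tlb_gen; /* TLB generation this slot was last brought up to */
};

/* Stand-in for the per-CPU TLB state. */
struct sketch_cpu_state {
	struct sketch_slot slots[SKETCH_NR_DYN_ASIDS];
	uint16_t next_slot; /* assumed round-robin cursor for fresh slots */
};

struct sketch_choice {
	uint16_t asid;   /* slot index to load */
	bool need_flush; /* true if stale entries under this slot must go */
};

/*
 * Pick a slot for the mm identified by ctx_id, whose current TLB
 * generation is tlb_gen, and record the choice in the per-CPU state.
 */
static struct sketch_choice sketch_choose_asid(struct sketch_cpu_state *cpu,
					       uint64_t ctx_id,
					       uint64_t tlb_gen)
{
	struct sketch_choice c;

	assert(cpu != NULL);
	assert(ctx_id != 0); /* 0 is reserved for "slot never used" */

	/* Recently used on this CPU: reuse the slot, flush only if stale. */
	for (uint16_t i = 0; i < SKETCH_NR_DYN_ASIDS; i++) {
		if (cpu->slots[i].ctx_id != ctx_id)
			continue;
		c.asid = i;
		c.need_flush = cpu->slots[i].tlb_gen < tlb_gen;
		cpu->slots[i].tlb_gen = tlb_gen;
		return c;
	}

	/* No slot owned on this CPU: take the next one and flush it. */
	c.asid = cpu->next_slot;
	c.need_flush = true;
	cpu->next_slot = (uint16_t)((cpu->next_slot + 1) % SKETCH_NR_DYN_ASIDS);
	cpu->slots[c.asid].ctx_id = ctx_id;
	cpu->slots[c.asid].tlb_gen = tlb_gen;
	return c;
}
```

In this sketch, ping-ponging between two mms on one CPU flushes once per mm
and then keeps hitting the reuse path, which is the case the "about 100ns"
figure in the quoted text is measured on.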