From: Andy Lutomirski
Date: Wed, 21 Jun 2017 19:46:05 -0700
Subject: Re: [PATCH v3 05/11] x86/mm: Track the TLB's tlb_gen and update the flushing algorithm
To: Borislav Petkov
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org", Linus Torvalds, Andrew Morton, Mel Gorman, "linux-mm@kvack.org", Nadav Amit, Rik van Riel, Dave Hansen, Arjan van de Ven, Peter Zijlstra
In-Reply-To: <20170621184424.eixb2jdyy66xq4hg@pd.tnic>
References: <91f24a6145b2077f992902891f8fa59abe5c8696.1498022414.git.luto@kernel.org> <20170621184424.eixb2jdyy66xq4hg@pd.tnic>

On Wed, Jun 21, 2017 at 11:44 AM, Borislav Petkov wrote:
> On Tue, Jun 20, 2017 at 10:22:11PM -0700, Andy Lutomirski wrote:
>> +	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, next->context.ctx_id);
>> +	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen,
>> +		       atomic64_read(&next->context.tlb_gen));
>
> Just let it stick out:
>
>	this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, next->context.ctx_id);
>	this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, atomic64_read(&next->context.tlb_gen));
>
> Should be a bit better readable this way.

Done

>> +	if (local_tlb_gen == mm_tlb_gen) {
>
> if (unlikely(...
>
> maybe?
>
> Sounds to me like the concurrent flushes case would be the
> uncommon one...

Agreed.

>> +
>> +	WARN_ON_ONCE(local_tlb_gen > mm_tlb_gen);
>> +	WARN_ON_ONCE(f->new_tlb_gen > mm_tlb_gen);
>> +
>> +	/*
>> +	 * If we get to this point, we know that our TLB is out of date.
>> +	 * This does not strictly imply that we need to flush (it's
>> +	 * possible that f->new_tlb_gen <= local_tlb_gen), but we're
>> +	 * going to need to flush in the very near future, so we might
>> +	 * as well get it over with.
>> +	 *
>> +	 * The only question is whether to do a full or partial flush.
>> +	 *
>> +	 * A partial TLB flush is safe and worthwhile if two conditions are
>> +	 * met:
>> +	 *
>> +	 * 1. We wouldn't be skipping a tlb_gen. If the requester bumped
>> +	 *    the mm's tlb_gen from p to p+1, a partial flush is only correct
>> +	 *    if we would be bumping the local CPU's tlb_gen from p to p+1 as
>> +	 *    well.
>> +	 *
>> +	 * 2. If there are no more flushes on their way. Partial TLB
>> +	 *    flushes are not all that much cheaper than full TLB
>> +	 *    flushes, so it seems unlikely that it would be a
>> +	 *    performance win to do a partial flush if that won't bring
>> +	 *    our TLB fully up to date.
>> +	 */
>> +	if (f->end != TLB_FLUSH_ALL &&
>> +	    f->new_tlb_gen == local_tlb_gen + 1 &&
>> +	    f->new_tlb_gen == mm_tlb_gen) {
>
> I'm certainly still missing something here:
>
> We have f->new_tlb_gen and mm_tlb_gen to control the flushing, i.e., we
> do once
>
>	bump_mm_tlb_gen(mm);
>
> and once
>
>	info.new_tlb_gen = bump_mm_tlb_gen(mm);
>
> and in both cases, the bumping is done on mm->context.tlb_gen.
>
> So why isn't that enough to do the flushing and we have to consult
> info.new_tlb_gen too?

The issue is a possible race. Suppose we start at tlb_gen == 1 and
then two concurrent flushes happen. The first flush is a full flush
and sets tlb_gen to 2. The second is a partial flush and sets tlb_gen
to 3.
If the second flush gets propagated to a given CPU first and it
were to do an actual partial flush (INVLPG) and set the percpu tlb_gen
to 3, then the first flush won't do anything and we'll fail to flush
all the pages we need to flush.

My solution was to say that we're only allowed to do INVLPG if we're
making exactly the same change to the local tlb_gen that the requester
made to context.tlb_gen.

I'll add a comment to this effect.

>
>> +		/* Partial flush */
>>  		unsigned long addr;
>>  		unsigned long nr_pages = (f->end - f->start) >> PAGE_SHIFT;
>
> <---- newline here.

Yup.
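
For illustration, the race described above can be replayed in a small
user-space model. This is only a sketch under stated assumptions, not the
code from the patch: handle_flush(), replay_race(), struct flush_req and
enum flush_kind are hypothetical names, the "full" flag stands in for
f->end == TLB_FLUSH_ALL, and no real TLB is touched. It only tracks which
kind of flush the rule would pick and what the per-CPU tlb_gen becomes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum flush_kind { FLUSH_NONE, FLUSH_PARTIAL, FLUSH_FULL };

struct flush_req {
	bool full;            /* models f->end == TLB_FLUSH_ALL */
	uint64_t new_tlb_gen; /* value the requester bumped the mm's tlb_gen to */
};

/*
 * Model of the decision quoted in the thread: a partial flush is chosen
 * only if it moves the local tlb_gen by exactly the step the requester
 * made and also brings this CPU fully up to date.
 */
static enum flush_kind handle_flush(uint64_t *local_tlb_gen,
				    uint64_t mm_tlb_gen,
				    const struct flush_req *f)
{
	if (*local_tlb_gen == mm_tlb_gen)
		return FLUSH_NONE;	/* already up to date */

	if (!f->full &&
	    f->new_tlb_gen == *local_tlb_gen + 1 &&
	    f->new_tlb_gen == mm_tlb_gen) {
		*local_tlb_gen = mm_tlb_gen;
		return FLUSH_PARTIAL;	/* the INVLPG path */
	}

	*local_tlb_gen = mm_tlb_gen;
	return FLUSH_FULL;		/* catch up in one go */
}

/*
 * Replay the race: tlb_gen starts at 1, a full flush bumps it to 2, a
 * partial flush bumps it to 3, and the partial request reaches this CPU
 * first. out[0] and out[1] record what the two requests end up doing.
 */
static void replay_race(enum flush_kind out[2], uint64_t *local_tlb_gen)
{
	const struct flush_req full_req = { .full = true, .new_tlb_gen = 2 };
	const struct flush_req partial_req = { .full = false, .new_tlb_gen = 3 };
	const uint64_t mm_tlb_gen = 3;	/* both bumps have already happened */

	*local_tlb_gen = 1;
	out[0] = handle_flush(local_tlb_gen, mm_tlb_gen, &partial_req);
	out[1] = handle_flush(local_tlb_gen, mm_tlb_gen, &full_req);
}
```

In this model the partial request that arrives first is upgraded to a full
flush, because 3 != 1 + 1, so the late-arriving full request finds the CPU
already at generation 3 and has nothing left to do. Without the
"local_tlb_gen + 1" check, the first request would have invalidated only
its own range and the full flush would have been skipped.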