From: Andy Lutomirski
Date: Wed, 21 Jun 2017 08:15:38 -0700
Subject: Re: [PATCH v3 01/11] x86/mm: Don't reenter flush_tlb_func_common()
To: Borislav Petkov
Cc: Andy Lutomirski, X86 ML, "linux-kernel@vger.kernel.org",
    Linus Torvalds, Andrew Morton, Mel Gorman, "linux-mm@kvack.org",
    Nadav Amit, Rik van Riel, Dave Hansen, Arjan van de Ven,
    Peter Zijlstra
In-Reply-To: <20170621084902.vy7nvkon4krc7v3q@pd.tnic>

On Wed, Jun 21, 2017 at 1:49 AM, Borislav Petkov wrote:
> On Tue, Jun 20, 2017 at 10:22:07PM -0700, Andy Lutomirski wrote:
>> It was historically possible to have two concurrent TLB flushes
>> targetting the same CPU: one initiated locally and one initiated
>> remotely.  This can now cause an OOPS in leave_mm() at
>> arch/x86/mm/tlb.c:47:
>>
>>         if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
>>                 BUG();
>>
>> with this call trace:
>>  flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline]
>>  flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317
>
> These line numbers would most likely mean nothing soon. I think you
> should rather explain why the bug can happen so that future lookers at
> that code can find the spot...
>

That's why I gave function names and the actual code :)

> I'm assuming this is going away in a future patch, as disabling IRQs
> around a TLB flush is kinda expensive. I guess I'll see if I continue
> reading...

No, it's still there.  It's possible that it could be removed with
lots of care, but I'm not convinced it's worth it.
local_irq_disable() and local_irq_enable() are fast, though (3 cycles
each last time I benchmarked them?) -- it's local_irq_save() that
really hurts.

--Andy
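
For readers less familiar with the distinction drawn above, here is a
minimal userspace sketch of how the two pairs differ in behavior. It is
not kernel code: every model_* name and both flush_with_* helpers are
hypothetical, and a plain bool stands in for the CPU's interrupt-enable
flag. It models semantics only and says nothing about cycle counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt-enable flag (not a kernel API). */
static bool model_irqs_enabled = true;

typedef bool model_flags_t;

/* Modeled on local_irq_disable()/local_irq_enable(): unconditional. */
static void model_irq_disable(void) { model_irqs_enabled = false; }
static void model_irq_enable(void)  { model_irqs_enabled = true; }

/* Modeled on local_irq_save(): remember the old state, then disable. */
static model_flags_t model_irq_save(void)
{
	model_flags_t flags = model_irqs_enabled;

	model_irqs_enabled = false;
	return flags;
}

/* Modeled on local_irq_restore(): put back whatever the caller had. */
static void model_irq_restore(model_flags_t flags)
{
	model_irqs_enabled = flags;
}

/*
 * Critical section using the unconditional pair. It always leaves
 * interrupts enabled, so it is only safe when entered with IRQs on.
 */
static bool flush_with_disable_enable(void)
{
	model_irq_disable();
	assert(!model_irqs_enabled);	/* the flush work would run here */
	model_irq_enable();
	return model_irqs_enabled;
}

/* Critical section using save/restore. It preserves the caller's state. */
static bool flush_with_save_restore(void)
{
	model_flags_t flags = model_irq_save();

	assert(!model_irqs_enabled);	/* the flush work would run here */
	model_irq_restore(flags);
	return model_irqs_enabled;
}
```

If a caller enters with interrupts already off, flush_with_disable_enable()
returns with them on, while flush_with_save_restore() returns with them
still off. In this model, the unconditional pair is therefore only an
option on paths known to run with interrupts enabled.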