Message-ID: <20180822154046.823850812@infradead.org>
User-Agent: quilt/0.65
Date: Wed, 22 Aug 2018 17:30:15 +0200
From: Peter Zijlstra
To: torvalds@linux-foundation.org
Cc: peterz@infradead.org, luto@kernel.org, x86@kernel.org, bp@alien8.de,
    will.deacon@arm.com, riel@surriel.com, jannh@google.com,
    ascannell@google.com, dave.hansen@intel.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Nicholas Piggin, David Miller, Martin Schwidefsky, Michael Ellerman
Subject: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
References: <20180822153012.173508681@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().

This is indeed the case: RCU_TABLE_FREE does not provide TLB (cache)
invalidates. PowerPC-hash, where this code originated, and Sparc-hash,
where it was subsequently used, did not need them. ARM, which later
used this, put an explicit TLB invalidate in its __p*_free_tlb()
functions, and PowerPC-radix followed that example.

But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.

NOTE: s390 also needed something like this and might now be able to
      use the generic code again.

Cc: stable@kernel.org
Cc: Nicholas Piggin
Cc: David Miller
Cc: Will Deacon
Cc: Martin Schwidefsky
Cc: Michael Ellerman
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
Reported-by: Jann Horn
Signed-off-by: Peter Zijlstra (Intel)
---
 arch/Kconfig     |    3 +++
 arch/x86/Kconfig |    1 +
 mm/memory.c      |   27 +++++++++++++++++++++++++--
 3 files changed, 29 insertions(+), 2 deletions(-)

--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -362,6 +362,9 @@ config HAVE_ARCH_JUMP_LABEL
 config HAVE_RCU_TABLE_FREE
 	bool
 
+config HAVE_RCU_TABLE_INVALIDATE
+	bool
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
 	bool
 
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -180,6 +180,7 @@ config X86
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE
+	select HAVE_RCU_TABLE_INVALIDATE	if HAVE_RCU_TABLE_FREE
 	select HAVE_REGS_AND_STACK_ACCESS_API
 	select HAVE_RELIABLE_STACKTRACE		if X86_64 && (UNWINDER_FRAME_POINTER || UNWINDER_ORC) && STACK_VALIDATION
 	select HAVE_STACKPROTECTOR		if CC_HAS_SANE_STACKPROTECTOR
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -238,17 +238,22 @@ void arch_tlb_gather_mmu(struct mmu_gath
 	__tlb_reset_range(tlb);
 }
 
-static void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
+static void __tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 {
 	if (!tlb->end)
 		return;
 
 	tlb_flush(tlb);
 	mmu_notifier_invalidate_range(tlb->mm, tlb->start, tlb->end);
+	__tlb_reset_range(tlb);
+}
+
+static void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
+{
+	__tlb_flush_mmu_tlbonly(tlb);
 #ifdef CONFIG_HAVE_RCU_TABLE_FREE
 	tlb_table_flush(tlb);
 #endif
-	__tlb_reset_range(tlb);
 }
 
 static void tlb_flush_mmu_free(struct mmu_gather *tlb)
@@ -330,6 +335,21 @@ bool __tlb_remove_page_size(struct mmu_g
  * See the comment near struct mmu_table_batch.
  */
 
+/*
+ * If we want tlb_remove_table() to imply TLB invalidates.
+ */
+static inline void tlb_table_invalidate(struct mmu_gather *tlb)
+{
+#ifdef CONFIG_HAVE_RCU_TABLE_INVALIDATE
+	/*
+	 * Invalidate page-table caches used by hardware walkers. Then we still
+	 * need to RCU-sched wait while freeing the pages because software
+	 * walkers can still be in-flight.
+	 */
+	__tlb_flush_mmu_tlbonly(tlb);
+#endif
+}
+
 static void tlb_remove_table_smp_sync(void *arg)
 {
 	/* Simply deliver the interrupt */
@@ -366,6 +386,7 @@ void tlb_table_flush(struct mmu_gather *
 	struct mmu_table_batch **batch = &tlb->batch;
 
 	if (*batch) {
+		tlb_table_invalidate(tlb);
 		call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
 		*batch = NULL;
 	}
@@ -378,11 +399,13 @@ void tlb_remove_table(struct mmu_gather
 	if (*batch == NULL) {
 		*batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
 		if (*batch == NULL) {
+			tlb_table_invalidate(tlb);
 			tlb_remove_table_one(table);
 			return;
 		}
 		(*batch)->nr = 0;
 	}
+
 	(*batch)->tables[(*batch)->nr++] = table;
 	if ((*batch)->nr == MAX_TABLE_BATCH)
 		tlb_table_flush(tlb);
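
For readers who want to see the ordering problem in isolation, here is a
rough user-space sketch of the control flow touched above. It is not kernel
code. Every model_* name, struct model_gather, and MODEL_MAX_BATCH are
hypothetical stand-ins invented for illustration, and the model collapses the
IPI-sync path (tlb_remove_table_one()) and the RCU path (call_rcu_sched())
into a single "free" step. It only tracks whether a table is freed while a
TLB range is still pending, assuming the table_invalidate flag plays the role
of CONFIG_HAVE_RCU_TABLE_INVALIDATE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_BATCH 4	/* hypothetical; stands in for MAX_TABLE_BATCH */

/* Hypothetical user-space stand-in for struct mmu_gather. */
struct model_gather {
	bool range_pending;	/* models tlb->end != 0: an unflushed range */
	bool have_batch;	/* models *batch != NULL */
	size_t nr;		/* tables queued in the batch */
	unsigned invalidates;	/* TLB invalidates issued */
	unsigned freed;		/* page tables handed to the free path */
	unsigned freed_stale;	/* freed while the range was still pending */
	bool table_invalidate;	/* models CONFIG_HAVE_RCU_TABLE_INVALIDATE */
};

/* Models __tlb_flush_mmu_tlbonly(): flush and reset the range, nothing else. */
static void model_flush_tlbonly(struct model_gather *g)
{
	if (!g->range_pending)
		return;
	g->invalidates++;
	g->range_pending = false;
}

/* Models tlb_table_invalidate(): a no-op unless the option is selected. */
static void model_table_invalidate(struct model_gather *g)
{
	if (g->table_invalidate)
		model_flush_tlbonly(g);
}

/* Counts frees that happen before the pending range was invalidated. */
static void model_free_tables(struct model_gather *g, size_t n)
{
	if (g->range_pending)
		g->freed_stale += n;
	g->freed += n;
}

/* Models tlb_table_flush(): invalidate first, then free the whole batch. */
static void model_table_flush(struct model_gather *g)
{
	if (!g->have_batch)
		return;
	model_table_invalidate(g);
	model_free_tables(g, g->nr);
	g->have_batch = false;
	g->nr = 0;
}

/* Models tlb_remove_table(); alloc_ok simulates __get_free_page() succeeding. */
static void model_remove_table(struct model_gather *g, bool alloc_ok)
{
	if (!g->have_batch) {
		if (!alloc_ok) {
			/* The !*batch slow path: one table is freed right away. */
			model_table_invalidate(g);
			model_free_tables(g, 1);
			return;
		}
		g->have_batch = true;
		g->nr = 0;
	}
	assert(g->nr < MODEL_MAX_BATCH);
	g->nr++;
	if (g->nr == MODEL_MAX_BATCH)
		model_table_flush(g);
}
```

With table_invalidate cleared, a failed batch allocation while a range is
pending bumps freed_stale, which corresponds to the case Jann reported. With
it set, the same call issues one invalidate before the free.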