From: Mathieu Desnoyers
To: Ingo Molnar, Peter Zijlstra, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Andy Lutomirski, "Paul E .
 McKenney", Boqun Feng, Andrew Hunter, Maged Michael, Avi Kivity,
 Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Dave Watson,
 "H . Peter Anvin", Andrea Parri, Russell King, Greg Hackmann, Will Deacon,
 David Sehr, Linus Torvalds, x86@kernel.org, Mathieu Desnoyers,
 linux-arch@vger.kernel.org
Subject: [PATCH for 4.16 v4 09/11] membarrier: x86: Provide core serializing command
Date: Mon, 29 Jan 2018 15:20:18 -0500
Message-Id: <20180129202020.8515-10-mathieu.desnoyers@efficios.com>
In-Reply-To: <20180129202020.8515-1-mathieu.desnoyers@efficios.com>
References: <20180129202020.8515-1-mathieu.desnoyers@efficios.com>

There are two places where core serialization is needed by membarrier:

1) When returning from the membarrier IPI,
2) After the scheduler updates curr to a thread with a different mm, before
   going back to user-space, since the curr->mm is used by membarrier to
   check whether it needs to send an IPI to that CPU.

x86-32 uses iret as return from interrupt, and both iret and sysexit to go
back to user-space. The iret instruction is core serializing, but sysexit
is not.

x86-64 uses iret as return from interrupt, which takes care of the IPI.
However, it can return to user-space through either sysretl (compat
code), sysretq, or iret. Given that sysret{l,q} is not core serializing,
we rely instead on write_cr3() performed by switch_mm() to provide core
serialization after changing the current mm, and deal with the special
case of kthread -> uthread (temporarily keeping the current mm in
active_mm) by adding a sync_core() in that specific case.

Use the new sync_core_before_usermode() to guarantee this.

Signed-off-by: Mathieu Desnoyers
Acked-by: Peter Zijlstra (Intel)
CC: Andy Lutomirski
CC: Paul E. McKenney
CC: Boqun Feng
CC: Andrew Hunter
CC: Maged Michael
CC: Avi Kivity
CC: Benjamin Herrenschmidt
CC: Paul Mackerras
CC: Michael Ellerman
CC: Dave Watson
CC: Thomas Gleixner
CC: Ingo Molnar
CC: "H. Peter Anvin"
CC: Andrea Parri
CC: Russell King
CC: Greg Hackmann
CC: Will Deacon
CC: David Sehr
CC: x86@kernel.org
CC: linux-arch@vger.kernel.org
---
Changes since v1:
- Use the newly introduced sync_core_before_usermode(). Move all state
  handling to generic code.
- Add linux/processor.h include to include/linux/sched/mm.h.
Changes since v2:
- Fix use-after-free in membarrier_mm_sync_core_before_usermode.
Changes since v3:
- Move generic code into separate patch.
---
 arch/x86/Kconfig          | 1 +
 arch/x86/entry/entry_32.S | 5 +++++
 arch/x86/entry/entry_64.S | 4 ++++
 arch/x86/mm/tlb.c         | 7 ++++---
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0b44c8dd0e95..b5324f2e3162 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,6 +54,7 @@ config X86
 	select ARCH_HAS_FORTIFY_SOURCE
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_KCOV			if X86_64
+	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_PMEM_API		if X86_64
 	select ARCH_HAS_REFCOUNT
 	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 60c4c342316c..267d747a867f 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -565,6 +565,11 @@ restore_all:
 .Lrestore_nocheck:
 	RESTORE_REGS 4				# skip orig_eax/error_code
 .Lirq_return:
+	/*
+	 * ARCH_HAS_MEMBARRIER_SYNC_CORE relies on iret core serialization
+	 * when returning from IPI handler and when returning from
+	 * scheduler to user-space.
+	 */
 	INTERRUPT_RETURN
 
 .section .fixup, "ax"
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ff6f8022612c..52e20d37cdd3 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -803,6 +803,10 @@ GLOBAL(restore_regs_and_return_to_kernel)
 	POP_EXTRA_REGS
 	POP_C_REGS
 	addq	$8, %rsp	/* skip regs->orig_ax */
+	/*
+	 * ARCH_HAS_MEMBARRIER_SYNC_CORE relies on iret core serialization
+	 * when returning from IPI handler.
+	 */
 	INTERRUPT_RETURN
 
 ENTRY(native_iret)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 9fa7d2e0e15e..9b34121c8f05 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -229,9 +229,10 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	this_cpu_write(cpu_tlbstate.is_lazy, false);
 
 	/*
-	 * The membarrier system call requires a full memory barrier
-	 * before returning to user-space, after storing to rq->curr.
-	 * Writing to CR3 provides that full memory barrier.
+	 * The membarrier system call requires a full memory barrier and
+	 * core serialization before returning to user-space, after
+	 * storing to rq->curr. Writing to CR3 provides that full
+	 * memory barrier and core serializing instruction.
 	 */
 	if (real_prev == next) {
 		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
-- 
2.11.0
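
As a reading aid, the reasoning in the changelog can be restated as a small user-space C model. This is only a sketch under stated assumptions, not kernel code: every type and function name below (mm_id, exit_insn_serializes, switch_writes_cr3, needs_explicit_sync_core) is hypothetical and does not exist in the tree. It encodes what the changelog states: iret is core serializing, sysexit and sysret{l,q} are not, and write_cr3() in switch_mm() covers a change of mm, leaving the kthread -> uthread case that reuses active_mm as the one that needs an explicit sync_core().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: ways x86 leaves the kernel, per the changelog. */
enum exit_insn { EXIT_IRET, EXIT_SYSEXIT, EXIT_SYSRETL, EXIT_SYSRETQ };

/* Per the changelog, only iret is core serializing among these. */
static inline bool exit_insn_serializes(enum exit_insn insn)
{
	return insn == EXIT_IRET;
}

/* Hypothetical address-space identifier; 0 stands for "no mm" (kernel thread). */
typedef int mm_id;

/*
 * Assumption of this model: switch_mm() executes write_cr3() only when the
 * mm that was really loaded (prev_active_mm) differs from the next mm.
 */
static inline bool switch_writes_cr3(mm_id prev_active_mm, mm_id next_mm)
{
	return prev_active_mm != next_mm;
}

/*
 * Returns true when neither an IPI nor write_cr3() can be relied upon, so an
 * explicit sync_core() is needed before going back to user-space.
 *
 *   prev_mm        - curr->mm of the outgoing task (0 for a kthread)
 *   prev_active_mm - mm actually loaded while the outgoing task ran
 *   next_mm        - curr->mm of the incoming task (0 for a kthread)
 */
static inline bool needs_explicit_sync_core(mm_id prev_mm, mm_id prev_active_mm,
					    mm_id next_mm)
{
	if (next_mm == 0)
		return false;	/* incoming kthread: no return to user-space */
	if (prev_mm == next_mm)
		return false;	/* curr->mm unchanged: membarrier sends the IPI */
	if (switch_writes_cr3(prev_active_mm, next_mm))
		return false;	/* write_cr3() is core serializing */
	return true;		/* kthread -> uthread reusing active_mm */
}
```

For instance, needs_explicit_sync_core(0, 2, 2) models a kernel thread that borrowed mm 2 as its active_mm switching to a user thread of mm 2: CR3 is not rewritten, so the model asks for an explicit sync_core(). A switch between user threads of two different processes, such as needs_explicit_sync_core(1, 1, 2), is covered by write_cr3().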