Subject: Re: [PATCH for 4.16 v7 02/11] powerpc: membarrier: Skip memory barrier in switch_mm()
To: Mathieu
 Desnoyers, Ingo Molnar, Peter Zijlstra, Thomas Gleixner
Cc: Maged Michael, Dave Watson, Will Deacon, Russell King, David Sehr,
 Paul Mackerras, "H . Peter Anvin", linux-arch@vger.kernel.org,
 x86@kernel.org, Andrew Hunter, Greg Hackmann, Alan Stern,
 "Paul E . McKenney", Andrea Parri, Avi Kivity, Boqun Feng,
 linuxppc-dev@lists.ozlabs.org, Nicholas Piggin, Alexander Viro,
 Andy Lutomirski, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
 Linus Torvalds
References: <20180129202020.8515-1-mathieu.desnoyers@efficios.com>
 <20180129202020.8515-3-mathieu.desnoyers@efficios.com>
From: Christophe Leroy
Message-ID: <8b200dd5-f37b-b208-82fb-2775df7bcd49@csgroup.eu>
Date: Fri, 18 Jun 2021 19:13:59 +0200
In-Reply-To: <20180129202020.8515-3-mathieu.desnoyers@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/01/2018 at 21:20, Mathieu Desnoyers wrote:
> Allow PowerPC to skip the full memory barrier in switch_mm(), and
> only issue the barrier when scheduling into a task belonging to a
> process that has registered to use expedited private.
>
> Threads targeting the same VM but which belong to different thread
> groups is a tricky case. It has a few consequences:
>
> It turns out that we cannot rely on get_nr_threads(p) to count the
> number of threads using a VM. We can use
> (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)
> instead to skip the synchronize_sched() for cases where the VM only has
> a single user, and that user only has a single thread.
>
> It also turns out that we cannot use for_each_thread() to set
> thread flags in all threads using a VM, as it only iterates on the
> thread group.
>
> Therefore, test the membarrier state variable directly rather than
> relying on thread flags. This means
> membarrier_register_private_expedited() needs to set the
> MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and
> only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows
> private expedited membarrier commands to succeed.
> membarrier_arch_switch_mm() now tests for the
> MEMBARRIER_STATE_PRIVATE_EXPEDITED flag.

Looking at switch_mm_irqs_off(), I found it more complex than expected, and
found that this patch is the reason for that complexity.

Before the patch (i.e. in kernel 4.14), we have:

00000000 :
   0:	81 24 01 c8 	lwz     r9,456(r4)
   4:	71 29 00 01 	andi.   r9,r9,1
   8:	40 82 00 1c 	bne     24
   c:	39 24 01 c8 	addi    r9,r4,456
  10:	39 40 00 01 	li      r10,1
  14:	7d 00 48 28 	lwarx   r8,0,r9
  18:	7d 08 53 78 	or      r8,r8,r10
  1c:	7d 00 49 2d 	stwcx.  r8,0,r9
  20:	40 c2 ff f4 	bne-    14
  24:	7c 04 18 40 	cmplw   r4,r3
  28:	81 24 00 24 	lwz     r9,36(r4)
  2c:	91 25 04 4c 	stw     r9,1100(r5)
  30:	4d 82 00 20 	beqlr
  34:	48 00 00 00 	b       34
			34: R_PPC_REL24	switch_mmu_context

After the patch (i.e. in 5.13-rc6), that now is:

00000000 :
   0:	81 24 02 18 	lwz     r9,536(r4)
   4:	71 29 00 01 	andi.   r9,r9,1
   8:	41 82 00 24 	beq     2c
   c:	7c 04 18 40 	cmplw   r4,r3
  10:	81 24 00 24 	lwz     r9,36(r4)
  14:	91 25 04 d0 	stw     r9,1232(r5)
  18:	4d 82 00 20 	beqlr
  1c:	81 24 00 28 	lwz     r9,40(r4)
  20:	71 29 00 0a 	andi.   r9,r9,10
  24:	40 82 00 34 	bne     58
  28:	48 00 00 00 	b       28
			28: R_PPC_REL24	switch_mmu_context
  2c:	39 24 02 18 	addi    r9,r4,536
  30:	39 40 00 01 	li      r10,1
  34:	7d 00 48 28 	lwarx   r8,0,r9
  38:	7d 08 53 78 	or      r8,r8,r10
  3c:	7d 00 49 2d 	stwcx.  r8,0,r9
  40:	40 a2 ff f4 	bne     34
  44:	7c 04 18 40 	cmplw   r4,r3
  48:	81 24 00 24 	lwz     r9,36(r4)
  4c:	91 25 04 d0 	stw     r9,1232(r5)
  50:	4d 82 00 20 	beqlr
  54:	48 00 00 00 	b       54
			54: R_PPC_REL24	switch_mmu_context
  58:	2c 03 00 00 	cmpwi   r3,0
  5c:	41 82 ff cc 	beq     28
  60:	48 00 00 00 	b       60
			60: R_PPC_REL24	switch_mmu_context

In particular, the comparison of 'prev' to 0 is pointless, as both cases end
up just branching to 'switch_mmu_context'.

I don't understand why all that complexity is needed just to replace a simple
'smp_mb__after_unlock_lock()'.

#define smp_mb__after_unlock_lock()	smp_mb()
#define smp_mb()	barrier()
# define barrier() __asm__ __volatile__("": : :"memory")

Am I missing some subtlety?

Thanks
Christophe
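
For reference, the ordering described in the quoted changelog can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel's code: the bit values, struct fake_mm, fake_synchronize_sched(), register_private_expedited() and switch_mm_needs_barrier() are all hypothetical names made up for illustration, and the 'prev' check is inferred from the 'cmpwi r3,0' in the listing rather than taken from the source.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values, chosen for this sketch only. */
enum {
    STATE_PRIVATE_EXPEDITED_READY = 1u << 0,
    STATE_PRIVATE_EXPEDITED       = 1u << 1,
};

/* Stand-in for the per-mm membarrier state variable. */
struct fake_mm {
    atomic_uint membarrier_state;
};

/* Counts calls so the ordering can be observed from a test. */
static int sync_calls;

/* Stand-in for synchronize_sched(); a real grace-period wait is out of scope. */
static void fake_synchronize_sched(void)
{
    sync_calls++;
}

/*
 * Registration as described in the changelog: set the flag, wait for a
 * grace period unless the mm has a single user with a single thread,
 * and only then mark the state as ready.
 */
static void register_private_expedited(struct fake_mm *mm,
                                       int mm_users, int nr_threads)
{
    if (atomic_load(&mm->membarrier_state) & STATE_PRIVATE_EXPEDITED_READY)
        return; /* already registered */

    atomic_fetch_or(&mm->membarrier_state, STATE_PRIVATE_EXPEDITED);
    if (!(mm_users == 1 && nr_threads == 1))
        fake_synchronize_sched();
    atomic_fetch_or(&mm->membarrier_state, STATE_PRIVATE_EXPEDITED_READY);
}

/*
 * Context-switch side: test the state variable of 'next' directly rather
 * than a per-thread flag. The NULL check on 'prev' is an assumption.
 */
static bool switch_mm_needs_barrier(struct fake_mm *prev, struct fake_mm *next)
{
    if (prev == NULL)
        return false;
    return (atomic_load(&next->membarrier_state) & STATE_PRIVATE_EXPEDITED) != 0;
}
```

In this model the only thing the context-switch path decides is whether to execute a barrier. If that barrier expands to nothing but a compiler barrier, as in the macro chain above, both outcomes of the test lead to the same code.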