Date: Sat, 19 Jun 2021 14:27:33 +1000
From: Nicholas Piggin
Subject: Re: [PATCH 4/8] membarrier: Make the post-switch-mm barrier explicit
To: Andy Lutomirski, "Peter Zijlstra (Intel)", Rik van Riel
Cc: Andrew Morton, Dave Hansen, Linux Kernel Mailing List,
	linux-mm@kvack.org, Mathieu Desnoyers, "Paul E. McKenney",
	the arch/x86 maintainers
References: <1623816595.myt8wbkcar.astroid@bobo.none>
	<617cb897-58b1-8266-ecec-ef210832e927@kernel.org>
	<1623893358.bbty474jyy.astroid@bobo.none>
	<58b949fb-663e-4675-8592-25933a3e361c@www.fastmail.com>
	<1623911501.q97zemobmw.astroid@bobo.none>
	<5efaca70-35a0-1ce5-98ff-651a5f153a0a@kernel.org>
	<1624070824.uyhrzf8zc7.astroid@bobo.none>
Message-Id: <1624076552.h05ftritk3.astroid@bobo.none>
X-Mailing-List: linux-kernel@vger.kernel.org

Excerpts from Andy Lutomirski's message of June 19, 2021 1:20 pm:
>
>
> On Fri, Jun 18, 2021, at 7:53 PM, Nicholas Piggin wrote:
>> Excerpts from Andy Lutomirski's message of June 18, 2021 9:49 am:
>> > On 6/16/21 11:51 PM, Nicholas Piggin wrote:
>> >> Excerpts from Andy Lutomirski's message of June 17, 2021 3:32 pm:
>> >>> On Wed, Jun 16, 2021, at 7:57 PM, Andy Lutomirski wrote:
>> >>>>
>> >>>>
>> >>>> On Wed, Jun 16, 2021, at 6:37 PM, Nicholas Piggin wrote:
>> >>>>> Excerpts from Andy Lutomirski's message of June 17, 2021 4:41 am:
>> >>>>>> On 6/16/21 12:35 AM, Peter Zijlstra wrote:
>> >>>>>>> On Wed, Jun 16, 2021 at 02:19:49PM +1000, Nicholas Piggin wrote:
>> >>>>>>>> Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm:
>> >>>>>>>>> membarrier() needs a barrier after any CPU changes mm. There is currently
>> >>>>>>>>> a comment explaining why this barrier probably exists in all cases. This
>> >>>>>>>>> is very fragile -- any change to the relevant parts of the scheduler
>> >>>>>>>>> might get rid of these barriers, and it's not really clear to me that
>> >>>>>>>>> the barrier actually exists in all necessary cases.
>> >>>>>>>>
>> >>>>>>>> The comments and barriers in the mmdrop() hunks? I don't see what is
>> >>>>>>>> fragile or maybe-buggy about this. The barrier definitely exists.
>> >>>>>>>>
>> >>>>>>>> And any change can change anything, that doesn't make it fragile. My
>> >>>>>>>> lazy tlb refcounting change avoids the mmdrop in some cases, but it
>> >>>>>>>> replaces it with smp_mb for example.
>> >>>>>>>
>> >>>>>>> I'm with Nick again, on this. You're adding extra barriers for no
>> >>>>>>> discernible reason, that's not generally encouraged, seeing how extra
>> >>>>>>> barriers is extra slow.
>> >>>>>>>
>> >>>>>>> Both mmdrop() itself, as well as the callsite have comments saying how
>> >>>>>>> membarrier relies on the implied barrier, what's fragile about that?
>> >>>>>>>
>> >>>>>>
>> >>>>>> My real motivation is that mmgrab() and mmdrop() don't actually need to
>> >>>>>> be full barriers. The current implementation has them being full
>> >>>>>> barriers, and the current implementation is quite slow. So let's try
>> >>>>>> that commit message again:
>> >>>>>>
>> >>>>>> membarrier() needs a barrier after any CPU changes mm. There is currently
>> >>>>>> a comment explaining why this barrier probably exists in all cases. The
>> >>>>>> logic is based on ensuring that the barrier exists on every control flow
>> >>>>>> path through the scheduler. It also relies on mmgrab() and mmdrop() being
>> >>>>>> full barriers.
>> >>>>>>
>> >>>>>> mmgrab() and mmdrop() would be better if they were not full barriers. As a
>> >>>>>> trivial optimization, mmgrab() could use a relaxed atomic and mmdrop()
>> >>>>>> could use a release on architectures that have these operations.
>> >>>>>
>> >>>>> I'm not against the idea, I've looked at something similar before (not
>> >>>>> for mmdrop but a different primitive). Also my lazy tlb shootdown series
>> >>>>> could possibly take advantage of this, I might cherry pick it and test
>> >>>>> performance :)
>> >>>>>
>> >>>>> I don't think it belongs in this series though. Should go together with
>> >>>>> something that takes advantage of it.
>> >>>>
>> >>>> I’m going to see if I can get hazard pointers into shape quickly.
>> >>>
>> >>> Here it is. Not even boot tested!
>> >>>
>> >>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=sched/lazymm&id=ecc3992c36cb88087df9c537e2326efb51c95e31
>> >>>
>> >>> Nick, I think you can accomplish much the same thing as your patch by:
>> >>>
>> >>> #define for_each_possible_lazymm_cpu while (false)
>> >>
>> >> I'm not sure what you mean? For powerpc, other CPUs can be using the mm
>> >> as lazy at this point. I must be missing something.
>> >
>> > What I mean is: if you want to shoot down lazies instead of doing the
>> > hazard pointer trick to track them, you could do:
>> >
>> > #define for_each_possible_lazymm_cpu while (false)
>> >
>> > which would promise to the core code that you don't have any lazies left
>> > by the time exit_mmap() is done. You might need a new hook in
>> > exit_mmap() depending on exactly how you implement the lazy shootdown.
>>
>> Oh for configuring it away entirely. I'll have to see how it falls out,
>> I suspect we'd want to just no-op that entire function and avoid the 2
>> atomics if we are taking care of our lazy mms with shootdowns.
>
> Do you mean the smp_store_release()? On x86 and similar architectures, that’s almost free. I’m also not convinced it needs to be a real release.

Probably the shoot lazies code would compile that stuff out entirely so
not that as such, but the entire thing including the change to the
membarrier barrier (which as I said, shoot lazies could possibly take
advantage of anyway).

My point is I haven't seen how everything goes together or looked at
generated code so I can't exactly say yes to your question, but that
there's no reason it couldn't be made to nicely fold away based on config
option so I'm not too concerned about that issue.

Thanks,
Nick
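
For readers following the mmgrab()/mmdrop() discussion quoted above, here is a minimal user-space sketch of a reference count whose grab is a relaxed atomic and whose drop is a release. It uses C11 atomics rather than the kernel's atomic API, and the names mm_sketch, mm_sketch_init(), mm_sketch_grab() and mm_sketch_drop() are hypothetical. It is not the kernel implementation, only an illustration of the ordering being proposed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm reference count; not kernel code. */
struct mm_sketch {
	atomic_int count;
	bool freed;	/* set where a real implementation would free the mm */
};

static void mm_sketch_init(struct mm_sketch *mm)
{
	atomic_init(&mm->count, 1);
	mm->freed = false;
}

/* Taking a reference needs no ordering: the caller already holds one. */
static void mm_sketch_grab(struct mm_sketch *mm)
{
	atomic_fetch_add_explicit(&mm->count, 1, memory_order_relaxed);
}

/*
 * Dropping uses release so that earlier accesses to *mm are ordered
 * before the decrement.  Only the thread that sees the count reach
 * zero issues an acquire fence before tearing the object down.
 */
static void mm_sketch_drop(struct mm_sketch *mm)
{
	int old = atomic_fetch_sub_explicit(&mm->count, 1,
					    memory_order_release);

	assert(old > 0);
	if (old == 1) {
		atomic_thread_fence(memory_order_acquire);
		mm->freed = true;
	}
}
```

Neither operation is a full barrier in this form, so a caller that today depends on the barrier implied by mmdrop() would need its own explicit one, which appears to be the motivation for the patch under discussion.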