Date: Tue, 13 Feb 2024 15:29:24 +0000
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings
To: David Hildenbrand,
 Mark Rutland
Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Marc Zyngier,
 James Morse, Andrey Ryabinin, Andrew Morton, Matthew Wilcox,
 Kefeng Wang, John Hubbard, Zi Yan, Barry Song <21cnbao@gmail.com>,
 Alistair Popple, Yang Shi, Nicholas Piggin, Christophe Leroy,
 "Aneesh Kumar K.V", "Naveen N. Rao", Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 linux-arm-kernel@lists.infradead.org, x86@kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20240202080756.1453939-1-ryan.roberts@arm.com>
 <20240202080756.1453939-20-ryan.roberts@arm.com>
 <502a3ea7-fd86-4314-8292-c7999eda92eb@arm.com>
 <427ba87a-7dd0-4f3e-861f-fe6946b7cd97@redhat.com>
 <55a1e0ef-14b3-4311-b2aa-a6add76fa2ed@redhat.com>
From: Ryan Roberts
In-Reply-To: <55a1e0ef-14b3-4311-b2aa-a6add76fa2ed@redhat.com>

On 12/02/2024 16:24, David Hildenbrand wrote:
> On 12.02.24 16:34, Ryan Roberts wrote:
>> On 12/02/2024 15:26, David Hildenbrand wrote:
>>> On 12.02.24 15:45, Ryan Roberts wrote:
>>>> On 12/02/2024 13:54, David Hildenbrand wrote:
>>>>>>> If so, I wonder if we could instead do that comparison modulo the
>>>>>>> access/dirty bits,
>>>>>>
>>>>>> I think that would work - but will need to think a bit more on it.
>>>>>>
>>>>>>> and leave ptep_get_lockless() only reading a single entry?
>>>>>>
>>>>>> I think we will need to do something a bit less fragile. ptep_get() does
>>>>>> collect the access/dirty bits so it's confusing if ptep_get_lockless()
>>>>>> doesn't IMHO. So we will likely want to rename the function and make its
>>>>>> documentation explicit that it does not return those bits.
>>>>>>
>>>>>> ptep_get_lockless_noyoungdirty()? yuk... Any ideas?
>>>>>>
>>>>>> Of course if I could convince you the current implementation is safe, I
>>>>>> might be able to sidestep this optimization until a later date?
>>>>>
>>>>> As discussed (and pointed out above), there might be quite some callsites
>>>>> where we don't really care about uptodate accessed/dirty bits -- where
>>>>> ptep_get() is used nowadays.
>>>>>
>>>>> One way to approach that I had in mind was having an explicit interface:
>>>>>
>>>>> ptep_get()
>>>>> ptep_get_uptodate()
>>>>> ptep_get_lockless()
>>>>> ptep_get_lockless_uptodate()
>>>>
>>>> Yes, I like the direction of this. I guess we anticipate that call sites
>>>> requiring the "_uptodate" variant will be the minority so it makes sense to use
>>>> the current names for the "_not_uptodate" variants? But to do a slow migration,
>>>> it might be better/safer to have the weaker variant use the new name - that
>>>> would allow us to downgrade one at a time?
>>>
>>> Yes, I was primarily struggling with names. Likely it makes sense to either have
>>> two completely new function names, or use the new name only for the "faster but
>>> less precise" variant.
>>>
>>>>
>>>>>
>>>>> Especially the last one might not be needed.
>>>> I've done a scan through the code and agree with Mark's original conclusions.
>>>> Additionally, huge_pte_alloc() (which isn't used for arm64) doesn't rely on
>>>> access/dirty info. So I think I could migrate everything to the weaker variant
>>>> fairly easily.
>>>>
>>>>>
>>>>> Further, "uptodate" might not be the best choice because of PageUptodate() and
>>>>> friends. But it's better than "youngdirty"/"noyoungdirty" IMHO.
>>>>
>>>> Certainly agree with "noyoungdirty" being a horrible name. How about "_sync" /
>>>> "_nosync"?
>>>
>>> I could live with
>>>
>>> ptep_get_sync()
>>> ptep_get_nosync()
>>>
>>> with proper documentation :)
>>
>> but could you live with:
>>
>> ptep_get()
>> ptep_get_nosync()
>> ptep_get_lockless_nosync()
>>
>> ?
>>
>> So leave the "slower, more precise" version with the existing name.
>
> Sure.
>

I'm just implementing this (as a separate RFC), and had an alternative idea for
naming/semantics:

    ptep_get()
    ptep_get_norecency()
    ptep_get_lockless()
    ptep_get_lockless_norecency()

The "_norecency" versions explicitly clear the access/dirty bits. This is useful
for the "compare to original pte to check we are not racing" pattern:

    pte = ptep_get_lockless_norecency(ptep)
    ...
    if (!pte_same(pte, ptep_get_norecency(ptep)))
        // RACE!
    ...

With the "_nosync" semantic, the access/dirty bits may or may not be set, so the
user has to explicitly clear them before doing the comparison. (I did consider a
pte_same_nosync() that would clear the bits for you, but that name is pretty
naff.)

The "_norecency" semantic requires always explicitly clearing the bits, so it
may be infinitesimally slower. But it gives a very clear expectation that the
access/dirty bits are always clear, and I think that's conveyed well in the name
too.

Thoughts?
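
To make the comparison pattern above concrete, here is a minimal userspace
sketch in C. It is a toy model and not kernel code: pte_t is a plain 64-bit
integer, the PTE_YOUNG/PTE_DIRTY bit positions are invented for illustration,
and pte_mknorecency() and pte_raced() are hypothetical helpers. The getter
names follow the proposal; their bodies are assumptions about the intended
semantics, not the RFC implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: a "pte" is just a 64-bit value (assumption, not the arm64 layout). */
typedef uint64_t pte_t;

#define PTE_VALID        (UINT64_C(1) << 0)
#define PTE_YOUNG        (UINT64_C(1) << 1)  /* access flag, hypothetical position */
#define PTE_DIRTY        (UINT64_C(1) << 2)  /* dirty flag, hypothetical position */
#define PTE_RECENCY_MASK (PTE_YOUNG | PTE_DIRTY)

/* Hypothetical helper: return the pte with access/dirty bits cleared. */
static inline pte_t pte_mknorecency(pte_t pte)
{
	return pte & ~PTE_RECENCY_MASK;
}

/* Locked read that never reports access/dirty (assumed semantics). */
static inline pte_t ptep_get_norecency(const pte_t *ptep)
{
	return pte_mknorecency(*ptep);
}

/* Lockless read; the volatile access stands in for a READ_ONCE-style load. */
static inline pte_t ptep_get_lockless_norecency(const pte_t *ptep)
{
	return pte_mknorecency(*(const volatile pte_t *)ptep);
}

/* Plain value comparison, as in the pattern quoted above. */
static inline bool pte_same(pte_t a, pte_t b)
{
	return a == b;
}

/*
 * Hypothetical wrapper for the race check: true if the entry changed in any
 * way other than its access/dirty bits since the snapshot was taken.
 */
static inline bool pte_raced(pte_t snapshot, const pte_t *ptep)
{
	return !pte_same(snapshot, ptep_get_norecency(ptep));
}
```

Because both reads clear the bits, hardware setting access/dirty between the
snapshot and the re-read does not register as a race, while a change to any
other bit does. With the "_nosync" semantic the caller would have to apply the
mask to both values itself before calling pte_same().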