Message-ID: <789cb7e4-8659-4244-b72e-e8fa0b26431d@arm.com>
Date: Tue, 23 Apr 2024 11:15:15 +0100
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/4] Reduce cost of ptep_get_lockless on arm64
To: David Hildenbrand, Mark Rutland, Catalin Marinas, Will Deacon,
 Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Andrew Morton,
 Muchun Song
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
References: <20240215121756.2734131-1-ryan.roberts@arm.com>
 <8bd9e136-8575-4c40-bae2-9b015d823916@redhat.com>
 <86680856-2532-495b-951a-ea7b2b93872f@arm.com>
 <35236bbf-3d9a-40e9-84b5-e10e10295c0c@redhat.com>
 <4fba71aa-8a63-4a27-8eaf-92a69b2cff0d@arm.com>
 <5a23518b-7974-4b03-bd6e-80ecf6c39484@redhat.com>
 <81aa23ca-18b1-4430-9ad1-00a2c5af8fc2@arm.com>
 <70a36403-aefd-4311-b612-84e602465689@redhat.com>
 <3e50030d-2289-4470-a727-a293baa21618@redhat.com>
 <772de69a-27fa-4d39-a75d-54600d767ad1@arm.com>
 <969dc6c3-2764-4a35-9fa6-7596832fb2a3@redhat.com>
 <11b1c25b-3e20-4acf-9be5-57b508266c5b@redhat.com>
 <89e04df9-6a2f-409c-ae7d-af1f91d0131e@arm.com>
From: Ryan Roberts

Hi David,

Sorry for the slow reply on this; it was due to a combination of thinking a bit
more about the options here and being out on holiday.

On 15/04/2024 17:02, David Hildenbrand wrote:
>>>> The potential problem I see with this is that the Arm ARM doesn't specify
>>>> which PTE of a contpte block the HW stores a/d in. So the HW _could_ update
>>>> them randomly and this could spuriously increase your check failure rate.
>>>> In reality I believe most implementations will update the PTE for the
>>>> address that caused the TLB to be populated. But in some cases, you could
>>>> have eviction (due to pressure or explicit invalidation) followed by
>>>> re-population due to faulting on a different page of the contpte block. In
>>>> this case you would see this type of problem too.
>>>>
>>>> But ultimately, isn't this basically equivalent to ptep_get_lockless()
>>>> returning potentially false-negatives for access and dirty? Just with a
>>>> much higher chance of getting a false-negative. How is this helping?
>>>
>>> You are performing an atomic read like GUP-fast wants you to. So there are no
>>> races to worry about like on other architectures: HW might *set* the dirty bit
>>> concurrently, but that's just fine.
>>
>> But you can still see false-negatives for access and dirty...
>
> Yes.
>
>>>
>>> The whole races you describe with concurrent folding/unfolding/ ... are
>>> irrelevant.
>>
>> And I think I convinced myself that you will only see false-negatives with
>> today's arm64 ptep_get(). But an order of magnitude fewer than with your
>> proposal (assuming 16 ptes per contpte block, and the a/d bits are in one of
>> those).
>>
>>>
>>> To me that sounds ... much simpler ;) But again, just something I've been
>>> thinking about.
>>
>> OK so this approach upgrades my "I'm fairly sure we never see false-positives"
>> to "we definitely never see false-positives". But it certainly increases the
>> quantity of false-negatives.
>
> Yes.
>
>>>
>>> The reuse of pte_get_lockless() outside GUP code might not have been the
>>> wisest choice.
>>>
>>
>> If you want to go down the ptep_get_gup_fast() route, you've still got to be
>> able to spec it, and I think it will land pretty close to my most recent stab
>> at respec'ing ptep_get_lockless() a couple of replies up on this thread.
>>
>> Where would your proposal leave the KVM use case? If you call it
>> ptep_get_gup_fast() presumably you wouldn't want to use it for KVM? So it
>> would be left with ptep_get()...
>
> It's using GUP-fast.
>
>>
>> Sorry this thread is getting so long. Just to summarise, I think there are
>> currently 3 solutions on the table:
>>
>>    - ptep_get_lockless() remains as is
>>    - ptep_get_lockless() wraps ptep_get()
>>    - ptep_get_lockless() wraps __ptep_get() (and gets a gup_fast rename)
>>
>> Based on discussion so far, that's also the order of my preference.
>
> (1) seems like the easiest thing to do.

Yes, I'm very much in favour of easy.

>
>>
>> Perhaps it's useful to enumerate why we dislike the current
>> ptep_get_lockless()?
>
> Well, you sent that patch series with "that aims to reduce the cost and
> complexity of ptep_get_lockless() for arm64". (2) and (3) would achieve that. :)

Touché!
I'd half forgotten that we were having this conversation in the context of
this series!

I guess your ptep_get_gup_fast() approach is very similar to
ptep_get_lockless_norecency()... So we are back to the beginning :)

But ultimately I've come to the conclusion that it is easy to reason about the
current arm64 ptep_get_lockless() implementation and see that it's correct. The
other options both have their drawbacks. Yes, there is a loop in the current
implementation that would be nice to get rid of, but I don't think it is really
any worse than the cmpxchg loops we already have in other helpers. I'm not
planning to pursue this any further.

Thanks for the useful discussion (as always).
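
For anyone skimming the thread, here is a rough userspace sketch of the
trade-off discussed above. It is a toy model built on C11 atomics, not the
arm64 kernel code: every toy_* name, the bit positions and the retry condition
are assumptions made purely for illustration. It contrasts a single atomic
read of one entry (which can miss a/d bits that the HW stored in another entry
of the block) with a gather-and-retry read, and shows a cmpxchg-style loop of
the kind the retry loop is being compared to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names and bit layouts here are hypothetical. */
#define TOY_CONT_PTES 16u           /* entries per block, as assumed in the thread */
#define TOY_PTE_YOUNG (1ull << 0)   /* stand-in for the access flag */
#define TOY_PTE_DIRTY (1ull << 1)   /* stand-in for the dirty flag */
#define TOY_AD_MASK   (TOY_PTE_YOUNG | TOY_PTE_DIRTY)

typedef _Atomic uint64_t toy_pte_t;

/* One atomic load of the target entry only: cheap, but it cannot see
 * a/d bits that were set on a different entry of the same block. */
static uint64_t toy_get_single(toy_pte_t *block, unsigned int idx)
{
	assert(idx < TOY_CONT_PTES);
	return atomic_load_explicit(&block[idx], memory_order_relaxed);
}

/* Gather a/d bits from every entry, then re-read the target entry and
 * retry if anything other than its a/d bits changed in the meantime. */
static uint64_t toy_get_lockless(toy_pte_t *block, unsigned int idx)
{
	for (;;) {
		uint64_t before = toy_get_single(block, idx);
		uint64_t gathered = before;

		for (unsigned int i = 0; i < TOY_CONT_PTES; i++)
			gathered |= atomic_load_explicit(&block[i],
					memory_order_relaxed) & TOY_AD_MASK;

		uint64_t after = toy_get_single(block, idx);

		/* Concurrent a/d updates are tolerated; other changes are not. */
		if (((before ^ after) & ~TOY_AD_MASK) == 0)
			return gathered | (after & TOY_AD_MASK);
	}
}

/* A cmpxchg loop for comparison: clear the access flag, report whether
 * it was set. On failure 'old' is refreshed and the update is retried. */
static bool toy_test_and_clear_young(toy_pte_t *ptep)
{
	uint64_t old = atomic_load_explicit(ptep, memory_order_relaxed);

	while (!atomic_compare_exchange_weak_explicit(ptep, &old,
			old & ~TOY_PTE_YOUNG,
			memory_order_relaxed, memory_order_relaxed))
		;

	return (old & TOY_PTE_YOUNG) != 0;
}
```

In this model, if the dirty bit is set only on entry 5 of a block, then
toy_get_single(block, 0) reports the entry as clean (a false-negative), while
toy_get_lockless(block, 0) reports it as dirty. Both loops are unbounded in
theory but only retry when they race with a concurrent update.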