Subject: Re: [PATCH v2 2/5] mm: avoid unnecessary flush on change_huge_pmd()
From: Nadav Amit
Date: Tue, 26 Oct 2021 13:07:37 -0700
To: Dave Hansen
Cc: Linux-MM, LKML, Andrea Arcangeli, Andrew Cooper, Andrew Morton,
    Andy Lutomirski, Dave Hansen, Peter Xu, Peter Zijlstra,
    Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin, "x86@kernel.org"
Message-Id: <640A6374-A06B-4E20-BF5D-9A21CC85CB12@gmail.com>
In-Reply-To: <4f604380-a52b-660c-af82-541dbd7652e4@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Oct 26, 2021, at 12:40 PM, Dave Hansen wrote:
>
> On 10/26/21 12:06 PM, Nadav Amit wrote:
>>
>> To make it very clear - consider the following scenario, in which
>> a volatile pointer p is mapped using a certain PTE, which is RW
>> (i.e., *p is writable):
>>
>>   CPU0                          CPU1
>>   ----                          ----
>>   x = *p
>>   [ PTE cached in TLB;
>>     PTE is not dirty ]
>>                                 clear_pte(PTE)
>>   *p = x
>>   [ needs to set dirty ]
>>
>> Note that there is no TLB flush in this scenario. The question
>> is whether the write access to *p would succeed, setting the
>> dirty bit on the clear, non-present entry.
>>
>> I was under the impression that the hardware AD-assist would
>> recheck the PTE atomically as it sets the dirty bit. But, as I
>> said, I am not sure anymore whether this is defined architecturally
>> (or at least would work in practice on all CPUs modulo the
>> Knights Landing thingy).
>
> Practically, at "x=*p", the thing that gets cached in the TLB will have
> Dirty=0. At the "*p=x", the CPU will decide it needs to do a write,
> find the Dirty=0 entry and will entirely discard it. In other words, it
> *acts* roughly like this:
>
>       x = *p
>       INVLPG(p)
>       *p = x;
>
> Where the INVLPG() and the "*p=x" are atomic. So, there's no
> _practical_ problem with your scenario. This specific behavior isn't
> architectural as far as I know, though.
>
> Although it's pretty much just academic, as for the architecture, are
> you getting hung up on the difference between the description of "Accessed":
>
>       Whenever the processor uses a paging-structure entry as part of
>       linear-address translation, it sets the accessed flag in that
>       entry
>
> and "Dirty:"
>
>       Whenever there is a write to a linear address, the processor
>       sets the dirty flag (if it is not already set) in the paging-
>       structure entry...
>
> Accessed says "as part of linear-address translation", which means that
> the address must have a translation. But, the "Dirty" section doesn't
> say that. It talks about "a write to a linear address" but not whether
> there is a linear address *translation* involved.
>
> If that's it, we could probably add a bit like:
>
>       In addition to setting the accessed flag, whenever there is a
>       write...
>
> before the dirty rules in the SDM.
>
> Or am I being dense and continuing to miss your point? :)

I think this time you got my question right. I was thrown off by the SDM
comment on RW permissions vs dirty that I mentioned before:

"If software on one logical processor writes to a page while software on
another logical processor concurrently clears the R/W flag in the
paging-structure entry that maps the page, execution on some processors may
result in the entry’s dirty flag being set (due to the write on the first
logical processor) and the entry’s R/W flag being clear (due to the
update to the entry on the second logical processor)."

I did not pay enough attention to the small differences that you mentioned
between accessed and dirty this time (although I did notice them before).

I do not think that the change that you offered to the SDM really clarifies
the situation. Setting the accessed flag is done as part of caching the PTE
in the TLB. The SDM change you propose does not clarify the atomicity of the
permission/PTE-validity check and the dirty-bit setting, or the fact that the
cached PTE is discarded from the TLB if the dirty bit needs to be set and the
entry is cached with it clear [I do not presume you would want the latter in
the SDM, since it is an implementation detail].

I just wonder how come the R/W-clearing and the P-clearing cause concurrent
dirty-bit setting to behave differently. I am not a hardware guy, but I would
imagine they would be the same...
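
To make the scenario concrete, here is a toy model in C of the behavior Dave
describes (discard the Dirty=0 TLB entry, then recheck the PTE and set the
dirty bit as one atomic step). It is only a sketch under that assumption and
says nothing about what real CPUs guarantee. The names toy_tlb, toy_walk,
toy_read, toy_write and toy_scenario are hypothetical and are not kernel or
hardware interfaces; only the P, R/W, A and D bit positions are the x86 ones.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE flag bits: present, writable, accessed, dirty. */
#define PTE_P  (UINT64_C(1) << 0)
#define PTE_RW (UINT64_C(1) << 1)
#define PTE_A  (UINT64_C(1) << 5)
#define PTE_D  (UINT64_C(1) << 6)

/* Hypothetical one-entry TLB holding a cached copy of the PTE. */
struct toy_tlb {
    bool     valid;
    uint64_t entry;
};

/*
 * Hypothetical page walk. The present/writable check and the A/D update
 * form one atomic step: if the PTE changes underneath us, the
 * compare-exchange fails and the check is repeated on the new value.
 * Returns false where real hardware would raise a page fault.
 */
static bool toy_walk(_Atomic uint64_t *pte, struct toy_tlb *tlb, bool write)
{
    uint64_t old = atomic_load(pte);

    for (;;) {
        if (!(old & PTE_P) || (write && !(old & PTE_RW))) {
            tlb->valid = false;
            return false;
        }
        uint64_t updated = old | PTE_A | (write ? PTE_D : 0);
        if (atomic_compare_exchange_weak(pte, &old, updated)) {
            tlb->valid = true;
            tlb->entry = updated;
            return true;
        }
    }
}

/* A read may be served from the cached entry without touching the PTE. */
static bool toy_read(_Atomic uint64_t *pte, struct toy_tlb *tlb)
{
    return tlb->valid ? true : toy_walk(pte, tlb, false);
}

/*
 * A write uses the cached entry only if it is already writable and dirty.
 * Otherwise the entry is discarded (the "INVLPG(p)" step) and the PTE is
 * walked again for write.
 */
static bool toy_write(_Atomic uint64_t *pte, struct toy_tlb *tlb)
{
    if (tlb->valid && (tlb->entry & PTE_RW) && (tlb->entry & PTE_D))
        return true;
    tlb->valid = false;
    return toy_walk(pte, tlb, true);
}

struct toy_result {
    bool     write_faulted;
    uint64_t final_pte;
};

/* The CPU0/CPU1 interleaving from the scenario, run sequentially. */
struct toy_result toy_scenario(void)
{
    _Atomic uint64_t pte = PTE_P | PTE_RW;
    struct toy_tlb tlb = { .valid = false, .entry = 0 };
    struct toy_result res;

    bool read_ok = toy_read(&pte, &tlb);        /* CPU0: x = *p  */
    assert(read_ok);
    (void)read_ok;
    atomic_store(&pte, 0);                      /* CPU1: clear_pte(PTE), no flush */
    res.write_faulted = !toy_write(&pte, &tlb); /* CPU0: *p = x  */
    res.final_pte = atomic_load(&pte);
    return res;
}
```

In this model the write on CPU0 faults and the cleared PTE stays clear, which
is the outcome I was hoping for. Replacing the compare-exchange in toy_walk
with a separate check followed by an unconditional OR of the dirty bit would
model the other possibility, in which the dirty bit could land on the
non-present entry.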