Date: Wed, 3 Apr 2024 15:19:36 -0700
References: <20240320005024.3216282-1-seanjc@google.com> <4d04b010-98f3-4eae-b320-a7dd6104b0bf@redhat.com>
Subject: Re: [RFC PATCH 0/4] KVM: x86/mmu: Rework marking folios dirty/accessed
From: Sean Christopherson
To: David Hildenbrand
Cc: David Matlack, Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    David Stevens, Matthew Wilcox

On Wed, Apr 03, 2024, David Hildenbrand wrote:
> On 03.04.24 02:17, Sean Christopherson wrote:
> > On Tue, Apr 02, 2024, David Hildenbrand wrote:
> > Aha! But try_to_unmap_one() also checks that refcount==mapcount+1, i.e. will
> > also keep the folio if it has been GUP'd. And __remove_mapping() explicitly states
> > that it needs to play nice with a GUP'd page being marked dirty before the
> > reference is dropped.
> >
> >  * Must be careful with the order of the tests. When someone has
> >  * a ref to the folio, it may be possible that they dirty it then
> >  * drop the reference. So if the dirty flag is tested before the
> >  * refcount here, then the following race may occur:
> >
> > So while it's totally possible for KVM to get a W=1,D=0 PTE, if I'm reading the
> > code correctly it's safe/legal so long as KVM either (a) marks the folio dirty
> > while holding a reference or (b) marks the folio dirty before returning from its
> > mmu_notifier_invalidate_range_start() hook, *AND* obviously if KVM drops its
> > mappings in response to mmu_notifier_invalidate_range_start().
> >
>
> Yes, I agree that it should work in the context of vmscan. But (b) is
> certainly a bit harder to swallow than "ordinary" (a) :)

Heh, all the more reason to switch KVM x86 from (b) => (a).

> As raised, if having a writable SPTE would imply having a writable+dirty
> PTE, then KVM MMU code wouldn't have to worry about syncing any dirty bits
> ever back to core-mm, so patch #2 would not be required. ... well, it would
> be replaced by an MMU notifier that notifies about clearing the PTE dirty
> bit :)

Hmm, we essentially already have an mmu_notifier today, since secondary MMUs
need to be invalidated before consuming dirty status. Isn't the end result
essentially a sane FOLL_TOUCH?

> ... because, then, there is also a subtle difference between
> folio_set_dirty() and folio_mark_dirty(), and I am still confused about the
> difference and not competent enough to explain the difference ... and KVM
> always does the former, while zapping code of pagecache folios does the
> latter ... hm

Ugh, just when I thought I finally had my head wrapped around this.

> Related note: IIRC, we usually expect most anon folios to be dirty.
>
> kvm_set_pfn_dirty()->kvm_set_page_dirty() does an unconditional
> SetPageDirty()->folio_set_dirty(). Doing a test-before-set might frequently
> avoid atomic ops.

Noted, definitely worth poking at.
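
For reference, a rough userspace sketch of the ordering that the
__remove_mapping() comment quoted above describes, and why option (a) plays
nice with it. This is a toy model under loud assumptions, not kernel code:
struct toy_folio, toy_mark_dirty_and_put() and toy_try_reclaim() are
hypothetical names, and mapcount, locking and memory-barrier details are
ignored entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a folio: a reference count plus a dirty flag. */
struct toy_folio {
	atomic_int refcount;
	atomic_bool dirty;
};

/*
 * Option (a): the holder marks the folio dirty while it still owns its
 * reference, and only then drops that reference.
 */
void toy_mark_dirty_and_put(struct toy_folio *folio)
{
	atomic_store(&folio->dirty, true);
	atomic_fetch_sub(&folio->refcount, 1);
}

/*
 * Reclaim side: freeze the refcount first (succeeds only if exactly the
 * expected references are held), then test the dirty flag.
 * Returns true if the folio may be freed.
 */
bool toy_try_reclaim(struct toy_folio *folio, int expected_refs)
{
	int expected = expected_refs;

	/* Someone else holds a reference (e.g. GUP): keep the folio. */
	if (!atomic_compare_exchange_strong(&folio->refcount, &expected, 0))
		return false;

	/* Dirtied before the extra reference was dropped: unfreeze and keep. */
	if (atomic_load(&folio->dirty)) {
		atomic_store(&folio->refcount, expected_refs);
		return false;
	}

	return true;
}
```

In this model a folio with an outstanding extra reference fails the freeze,
and a folio that was dirtied and then released fails the dirty test, so
either way the reclaim side keeps it.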
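
To make the test-before-set suggestion concrete, here is a minimal userspace
sketch using C11 atomics. It only illustrates the general pattern; it is not
the KVM or core-mm code. TOY_FLAG_DIRTY, toy_set_dirty() and
toy_test_then_set_dirty() are hypothetical names, and the bit position is an
assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_FLAG_DIRTY (1UL << 0)	/* hypothetical bit position */

/* Unconditional set: always issues an atomic read-modify-write. */
void toy_set_dirty(atomic_ulong *flags)
{
	atomic_fetch_or(flags, TOY_FLAG_DIRTY);
}

/*
 * Test-before-set: do a plain atomic load first and issue the RMW only if
 * the bit is still clear. Returns true if this call performed the RMW.
 */
bool toy_test_then_set_dirty(atomic_ulong *flags)
{
	if (atomic_load_explicit(flags, memory_order_relaxed) & TOY_FLAG_DIRTY)
		return false;

	atomic_fetch_or(flags, TOY_FLAG_DIRTY);
	return true;
}
```

If most folios are already dirty, as expected for anon memory, the common
case in this sketch becomes a load plus a branch instead of an atomic RMW.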