Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
From: Nadav Amit
Date: Sat, 29 Oct 2022 11:05:12 -0700
To: Linus Torvalds
Cc: Peter Zijlstra, Jann Horn, John Hubbard, X86 ML, Matthew Wilcox,
    Andrew Morton, kernel list, Linux-MM, Andrea Arcangeli,
    "Kirill A . Shutemov", jroedel@suse.de, ubizjak@gmail.com,
    Alistair Popple
Message-Id: <47678198-C502-47E1-B7C8-8A12352CDA95@gmail.com>
References: <20221022111403.531902164@infradead.org>
    <20221022114424.515572025@infradead.org>
    <2c800ed1-d17a-def4-39e1-09281ee78d05@nvidia.com>
    <6C548A9A-3AF3-4EC1-B1E5-47A7FFBEB761@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Oct 28, 2022, at 5:42 PM, Linus Torvalds wrote:

> I think the proper fix (or at least _a_ proper fix) would be to
> actually carry the dirty bit along to the __tlb_remove_page() point,
> and actually treat it exactly the same way as the page pointer itself
> - set the page dirty after the TLB flush, the same way we can free the
> page after the TLB flush.
>
> We could easily hide said dirty bit in the low bits of the
> "batch->pages[]" array or something like that. We'd just have to add
> the 'dirty' argument to __tlb_remove_page_size() and friends.

Thank you for your quick response. I was slow to respond due to jet lag.

Anyhow, I am not sure whether the solution that you propose would work.
Please let me know if my understanding makes sense.

Let’s assume that we do not call set_page_dirty() before we remove the
rmap but only after we invalidate the page [*]. Let’s also assume that
shrink_page_list() is called after the page’s rmap is removed and the
page is no longer mapped, but before set_page_dirty() is actually called.
In such a case, shrink_page_list() would consider the page clean, and
would indeed keep the page (since __remove_mapping() would find an
elevated page refcount), which appears to give us a chance to mark the
page as dirty later.

However, IIUC, in this case shrink_page_list() might still call
filemap_release_folio() and release the buffers, so calling
set_page_dirty() afterwards - after the actual TLB invalidation took
place - would fail.

> Your idea of "do the page_remove_rmap() late instead" would also work,
> but the reason I think just squirrelling away the dirty bit is the
> "proper" fix is that it would get rid of the whole need for
> 'force_flush' in this area entirely. So we'd not only fix that race
> you noticed, we'd actually do so and reduce the number of TLB flushes
> too.

I’m all for reducing the number of TLB flushes, and your solution does
sound better in general. I proposed what I considered to be the path of
least resistance (i.e., the least chance of breaking something). I can do
what you proposed, but I am not sure how to deal with the buffers being
removed.

One more note: This issue, I think, also affects
migrate_vma_collect_pmd(). Alistair recently addressed an issue there,
but in my prior feedback to him I missed this issue.

[*] Note that, for our scenario, it would be pretty much the same if we
also called set_page_dirty() before removing the rmap, but the page was
cleaned while the TLB invalidation had still not been performed.
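
For reference, the low-bit encoding suggested in the quoted text could
look roughly like the userspace sketch below. It is only an illustration
under stated assumptions: struct demo_page, struct demo_batch and all
demo_* helpers are hypothetical stand-ins, not the kernel's struct page,
mmu_gather batch, or any existing kernel API, and setting the "dirty"
field merely stands in for calling set_page_dirty() after the TLB flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BATCH_MAX 8
#define DEMO_DIRTY_BIT ((uintptr_t)1)

/* Hypothetical stand-in for struct page; the real layout differs. */
struct demo_page {
	unsigned long flags;	/* keeps the alignment above 1 byte */
	bool dirty;		/* stand-in for the page's dirty state */
};

/* Hypothetical stand-in for the batch->pages[] array. */
struct demo_batch {
	size_t nr;
	uintptr_t pages[DEMO_BATCH_MAX];
};

/* Pack the dirty flag into bit 0 of the (aligned) page pointer. */
static uintptr_t demo_encode(struct demo_page *page, bool dirty)
{
	uintptr_t raw = (uintptr_t)page;

	assert((raw & DEMO_DIRTY_BIT) == 0);	/* bit 0 must be free */
	return dirty ? (raw | DEMO_DIRTY_BIT) : raw;
}

/* Recover the page pointer by masking off the flag bit. */
static struct demo_page *demo_page_of(uintptr_t entry)
{
	return (struct demo_page *)(entry & ~DEMO_DIRTY_BIT);
}

static bool demo_is_dirty(uintptr_t entry)
{
	return (entry & DEMO_DIRTY_BIT) != 0;
}

/* Queue a page together with the dirty bit seen in its PTE. */
static bool demo_batch_add(struct demo_batch *batch,
			   struct demo_page *page, bool dirty)
{
	if (batch->nr == DEMO_BATCH_MAX)
		return false;	/* caller would have to flush first */
	batch->pages[batch->nr++] = demo_encode(page, dirty);
	return true;
}

/* Meant to run only after the (not modeled) TLB flush has completed. */
static void demo_batch_finish(struct demo_batch *batch)
{
	for (size_t i = 0; i < batch->nr; i++) {
		if (demo_is_dirty(batch->pages[i]))
			demo_page_of(batch->pages[i])->dirty = true;
	}
	batch->nr = 0;
}
```

A caller would queue pages with demo_batch_add() while tearing down
PTEs and call demo_batch_finish() once the flush is done. Note that this
only shows how the bit could be carried along; it does nothing about the
window described above, in which the page looks clean and its buffers
may be released before the deferred dirtying happens.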