Subject: Re: [PATCH 4/6] mm: hugetlb_vmemmap: add missing smp_wmb() before set_pte_at()
From: Muchun Song
Date: Thu, 18 Aug 2022 17:18:09 +0800
To: "Yin, Fengwei", Miaohe Lin
Cc: Andrew Morton, Mike Kravetz, Muchun Song, Linux MM, linux-kernel@vger.kernel.org
Message-Id: <15DD6DCA-39BC-4EA2-984F-D488E94CC4FF@linux.dev>
In-Reply-To: <7408156a-f708-5e73-d0a2-69b1acca9b96@intel.com>

> On Aug 18, 2022, at 16:54, Yin, Fengwei wrote:
>
> On 8/18/2022 4:40 PM, Muchun Song wrote:
>>
>>> On Aug 18, 2022, at 16:32, Yin, Fengwei wrote:
>>>
>>> On 8/18/2022 3:59 PM, Muchun Song wrote:
>>>>
>>>>> On Aug 18, 2022, at 15:52, Miaohe Lin wrote:
>>>>>
>>>>> On 2022/8/18 10:47, Muchun Song wrote:
>>>>>>
>>>>>>> On Aug 18, 2022, at 10:00, Yin, Fengwei wrote:
>>>>>>>
>>>>>>> On 8/18/2022 9:55 AM, Miaohe Lin wrote:
>>>>>>>>>>> 	/*
>>>>>>>>>>> 	 * The memory barrier inside __SetPageUptodate makes sure that
>>>>>>>>>>> 	 * preceding stores to the page contents become visible before
>>>>>>>>>>> 	 * the set_pte_at() write.
>>>>>>>>>>> 	 */
>>>>>>>>>>> 	__SetPageUptodate(page);
>>>>>>>>>> IIUC, in the case here we should make sure other CPUs can see the new
>>>>>>>>>> page's contents after they have seen that PG_uptodate is set. I think
>>>>>>>>>> commit 0ed361dec369 can tell us more details.
>>>>>>>>>>
>>>>>>>>>> I also looked at commit 52f37629fd3c to see why we need a barrier before
>>>>>>>>>> set_pte_at(), but I didn't find any info to explain why. I guess we want
>>>>>>>>>> to guarantee the ordering between the stores to the page's contents and
>>>>>>>>>> subsequent memory accesses through the corresponding virtual address.
>>>>>>>>>> Do you agree with this?
>>>>>>>>> This is my understanding also. Thanks.
>>>>>>>> That's also my understanding. Thanks both.
>>>>>>> One thing is unclear to me (not directly related to this patch): who is
>>>>>>> responsible for the read barrier on the read side in this case?
>>>>>>>
>>>>>>> For SetPageUptodate, there are pairing write/read memory barriers.
>>>>>>>
>>>>>>
>>>>>> I have the same question. So I think the example proposed by Miaohe is a
>>>>>> little different from the case (hugetlb_vmemmap) here.
>>>>>
>>>>> Per my understanding, the memory barrier in PageUptodate() is needed because
>>>>> the user might soon access the page contents using page_address() (the
>>>>> corresponding pagetable entry already exists). But for the above proposed
>>>>> case, if the user wants to access the page contents, the corresponding
>>>>> pagetable entry has to be visible first, or the page contents can't be
>>>>> accessed. So there should be a data dependency acting as a memory barrier
>>>>> between loading the pagetable entry and accessing the page contents.
>>>>> Or am I missing something?
>>>>
>>>> Yep, it is a data dependency. The difference between hugetlb_vmemmap and
>>>> PageUptodate() is that the page table entry (a pointer to the mapped page
>>>> frame) is loaded by the MMU while PageUptodate() is loaded by the CPU. Seems
>>>> like the data dependency should be inserted between the MMU access and the
>>>> CPU access. Maybe it is the hardware's guarantee?
>>> I just found the comment in pmd_install() explained why most arch has no read
>>
>> I think pmd_install() is a little different as well. We should make sure the
>> page table walker (like GUP) sees the correct PTE entry after it sees the pmd
>> entry.
>
> The difference I can see is that the pmd/pte case has both a hardware page
> walker and a software page walker (like GUP) on the read side, while the case
> here only has the hardware page walker on the read side. But I suppose the
> memory barrier requirement still applies here. I am not against this change.
> I just want to get a better understanding of the hardware behavior.
>
> Maybe we could do a test: add a large delay between reset_struct_page() and
> set_pte_at()?

Hi Miaohe,

Would you mind doing this test? One thread does vmemmap_restore_pte(), while
another thread checks whether it can still see a tail page with PG_head set
after the first thread has executed set_pte_at().

Thanks.

>
> Regards
> Yin, Fengwei
>
>>
>>> side memory barrier except alpha which has read side memory barrier.
>>
>> Right. Only alpha has a data dependency barrier.
>>
>>>
>>> Regards
>>> Yin, Fengwei
>>>
>>>>
>>>>> Thanks,
>>>>> Miaohe Lin
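
For reference, the ordering being discussed (initialize the contents, then
publish the pointer) can be sketched in userspace with C11 atomics. This is
only an analogy under stated assumptions: it does not model an MMU page table
walk, and struct payload, FLAG_STALE, FLAG_CLEAN, pte and run_publish_test()
are hypothetical names, not kernel symbols. The release store stands in for
smp_wmb() followed by set_pte_at(); the acquire load stands in for the read
side, which in the real case would rely on the dependency from the loaded
entry to the access.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct page contents that get reset. */
struct payload {
	unsigned long flags;
};

#define FLAG_STALE 1UL	/* analogue of a tail page still carrying PG_head */
#define FLAG_CLEAN 0UL	/* analogue of the state after reset_struct_page() */

static struct payload page;
static _Atomic(struct payload *) pte;	/* analogue of the page table entry */

static void *writer(void *arg)
{
	(void)arg;
	page.flags = FLAG_CLEAN;	/* plain store to the contents */
	/* Release store: the store above is ordered before the publish. */
	atomic_store_explicit(&pte, &page, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	int *bad = arg;
	struct payload *p;

	/* Spin until the entry is published, then read through it. */
	while ((p = atomic_load_explicit(&pte, memory_order_acquire)) == NULL)
		sched_yield();
	*bad = (p->flags != FLAG_CLEAN);
	return NULL;
}

/* Returns how many rounds observed stale contents after the publish. */
int run_publish_test(int rounds)
{
	int stale = 0;

	for (int i = 0; i < rounds; i++) {
		pthread_t w, r;
		int bad = 0;

		/* Reset state before the threads of this round exist. */
		page.flags = FLAG_STALE;
		atomic_store_explicit(&pte, (struct payload *)NULL,
				      memory_order_relaxed);

		pthread_create(&r, NULL, reader, &bad);
		pthread_create(&w, NULL, writer, NULL);
		pthread_join(w, NULL);
		pthread_join(r, NULL);
		stale += bad;
	}
	return stale;
}
```

With the release/acquire pair in place, run_publish_test() should never count
a stale observation. Weakening both operations to memory_order_relaxed would
remove that guarantee in the C11 model, although whether a stale value is ever
actually observed depends on the compiler and the hardware.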