Subject: Re: [PATCH -next 1/1] mm: hugetlb_vmemmap: Fix WARN_ON in vmemmap_remap_pte
From: Muchun Song
Date: Fri, 28 Oct 2022 10:45:09 +0800
To: Catalin Marinas
Cc: Anshuman Khandual, Wupeng Ma, Andrew Morton, Mike Kravetz, Muchun Song, Michal Hocko, Oscar Salvador, Linux Memory Management List, linux-kernel@vger.kernel.org

> On Oct 27, 2022, at 18:50, Catalin Marinas wrote:
>
> On Wed, Oct 26, 2022 at 02:06:00PM +0530, Anshuman Khandual wrote:
>> On 10/26/22 12:31, Muchun Song wrote:
>>>> On 10/25/22 12:06, Muchun Song wrote:
>>>>>> On Oct 25, 2022, at 09:42, Wupeng Ma wrote:
>>>>>> From: Ma Wupeng
>>>>>>
>>>>>> Commit f41f2ed43ca5 ("mm: hugetlb: free the vmemmap pages associated with
>>>>>> each HugeTLB page") add vmemmap_remap_pte to remap the tail pages as
>>>>>> read-only to catch illegal write operation to the tail page.
>>>>>>
>>>>>> However this will lead to WARN_ON in arm64 in __check_racy_pte_update()
>>>>>
>>>>> Thanks for your finding this issue.
>>>>>
>>>>>> since this may lead to dirty state cleaned.
>>>>>> This check is introduced by
>>>>>> commit 2f4b829c625e ("arm64: Add support for hardware updates of the
>>>>>> access and dirty pte bits") and the initial check is as follow:
>>>>>>
>>>>>> BUG_ON(pte_write(*ptep) && !pte_dirty(pte));
>>>>>>
>>>>>> Since we do need to mark this pte as read-only to catch illegal write
>>>>>> operation to the tail pages, use set_pte to replace set_pte_at to bypass
>>>>>> this check.
>>>>>
>>>>> In theory, the waring does not affect anything since the tail vmemmap
>>>>> pages are supposed to be read-only. So, skipping this check for vmemmap
>>>>
>>>> Tails vmemmap pages are supposed to be read-only, in practice but their
>>>> backing pages do have pte_write() enabled. Otherwise the VM_WARN_ONCE()
>>>> warning would not have triggered.
>>>
>>> Right.
>>>
>>>>
>>>> VM_WARN_ONCE(pte_write(old_pte) && !pte_dirty(pte),
>>>>              "%s: racy dirty state clearing: 0x%016llx -> 0x%016llx",
>>>>              __func__, pte_val(old_pte), pte_val(pte));
>>>>
>>>> Also, is not it true that the pte being remapped into a different page
>>>> as read only, than what it had originally (which will be freed up) i.e
>>>> the PFN in 'old_pte' and 'pte' will be different. Hence is there still
>>>
>>> Right.
>>>
>>>> a possibility for a race condition even when the PFN changes ?
>>>
>>> Sorry, I didn't get this question. Did you mean the PTE is changed from
>>> new (pte) to the old one (old_pte) by the hardware because of the update
>>> of dirty bit when a concurrent write operation to the tail vmemmap page?
>>
>> No, but is not vmemmap_remap_pte() reuses walk->reuse_page for all remaining
>> tails pages ? Is not there a PFN change, along with access permission change
>> involved in this remapping process ?
>
> For the record, as we discussed offline, changing the output address
> (pfn) of a pte is not safe without break-before-make if at least one of
> the mappings was writeable.
> The caller (vmemmap_remap_pte()) would need
> to be fixed to first invalidate the pte and then write the new pte. I

Hi Catalin,

Could you give more details about what issue this would cause? I am
not familiar with arm64.

> assume no other CPU accesses this part of the vmemmap while the pte is
> being remapped.

However, there is no guarantee that no other CPU accesses this pte.
For example, memory failure handling and memory compaction can both
obtain the head page from any tail struct page (read-only access) at
any time.

Thanks.

>
> --
> Catalin
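
For readers following the thread, here is a minimal user-space sketch of the break-before-make sequence Catalin describes (invalidate the pte, then write the new one). It is a toy model, not kernel code: the toy_* names, the bit positions, and the flush stub are all hypothetical, and the check only mirrors the pte_write(old_pte) && !pte_dirty(pte) condition quoted above. The early return for invalid entries is an assumption about how such a check treats a cleared pte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE layout. Bit positions are illustrative, NOT the arm64 encoding. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_VALID (1ULL << 0)
#define TOY_PTE_WRITE (1ULL << 1)
#define TOY_PTE_DIRTY (1ULL << 2)
#define TOY_PFN_SHIFT 12

int toy_warn_count;  /* how many times the racy-update check fired */
int toy_flush_count; /* how many times the TLB flush stub was called */

bool toy_pte_valid(toy_pte_t p) { return (p & TOY_PTE_VALID) != 0; }
bool toy_pte_write(toy_pte_t p) { return (p & TOY_PTE_WRITE) != 0; }
bool toy_pte_dirty(toy_pte_t p) { return (p & TOY_PTE_DIRTY) != 0; }

/* Build a valid toy PTE from a pfn and extra flag bits. */
toy_pte_t toy_mk_pte(uint64_t pfn, uint64_t flags)
{
    return (pfn << TOY_PFN_SHIFT) | flags | TOY_PTE_VALID;
}

/* Mirrors the quoted condition: old entry writable, new entry not dirty.
 * Assumed to be skipped when either entry is invalid. */
void toy_check_racy_update(toy_pte_t old_pte, toy_pte_t pte)
{
    if (!toy_pte_valid(old_pte) || !toy_pte_valid(pte))
        return;
    if (toy_pte_write(old_pte) && !toy_pte_dirty(pte))
        toy_warn_count++;
}

/* Checked store, standing in for a set_pte_at()-style helper. */
void toy_set_pte_at(toy_pte_t *ptep, toy_pte_t pte)
{
    toy_check_racy_update(*ptep, pte);
    *ptep = pte;
}

/* Stand-in for a TLB invalidation; only counts calls. */
void toy_flush_tlb(void) { toy_flush_count++; }

/* Direct remap: overwrite a live, writable entry in one step. */
void toy_remap_direct(toy_pte_t *ptep, toy_pte_t pte)
{
    toy_set_pte_at(ptep, pte);
}

/* Break-before-make: invalidate, flush, then install the new entry. */
void toy_remap_bbm(toy_pte_t *ptep, toy_pte_t pte)
{
    *ptep = 0;                     /* break: entry is now invalid */
    toy_flush_tlb();               /* drop any cached translation */
    assert(!toy_pte_valid(*ptep)); /* nothing live is being overwritten */
    toy_set_pte_at(ptep, pte);     /* make: install the new mapping */
}
```

In this model, toy_remap_direct() on a writable entry trips the check, while toy_remap_bbm() does not, because the old entry is already invalid when the new one is written. The sketch is single-threaded, so it says nothing about the window in which another CPU could touch the invalidated pte, which is the concern raised above.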