Received: by 2002:a25:b323:0:0:0:0:0 with SMTP id l35csp1376633ybj; Fri, 20 Sep 2019 09:27:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqwgQhq0pgph/x5yoaFuXAJza8XPS7Nf6H/4dYRxlQIM4VONVLGlrf4SQQaLjcg+lrqlJLdx X-Received: by 2002:a17:906:1ed6:: with SMTP id m22mr8566460ejj.135.1568996868810; Fri, 20 Sep 2019 09:27:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1568996868; cv=none; d=google.com; s=arc-20160816; b=lhLS+lbHQ2jy4ojLsKPMFCSH+iwFZ97styXCSFxKR+BhYuGd+XLgA0iAt7DFP0x4Rf 1QzCJPrJi19wa+mfU0on/2F7Oe5ystOOrnSusoTyxBBXryukZgvRDwRKCz6ZMbHe8yn7 /KS3R10XS3J868qoRU8yHA6sDqS+qO/urNzFuetybyfTWU4hdpFeRbbWRW/eDc2bYU4j 9jSxbohF4THPY/dE8D1g+nk+35pDO0ulrRRWwTbvSHXxp8Qc5lTUp8yLUl+wDTYae/hL Tess/9NIuCQ9+Z6k9Y/N9KI7tDQdKKn63TZcy2zRa0VqilCtDVN7Zh92QI5qIR4uVf8U oZcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:reply-to:message-id :subject:cc:to:from:date; bh=GVaSgb562lfYHSqnrw6KBEiRlTjSiJmGaUVAUEdqDPs=; b=joQS2ICAkJGA7c4wM2vw5PQ00QzAO05V1UI+nfMDlx0Ct+a0oxkklLVdVLtvRm3oXv xvqngYMQUu0EB1hTBNnhQrDezedXqFhNCX83/LJmWzFoMMLBG0XDA/PNo5JWs7l1o3Nz EqmMVUi8FbpDUD+kdkK9f/ePUlgstGKL//idXkFna+0A2TnRUSH9MmwV36RnWQhG4ii/ zBZ3aAQBcZ19M/dBuYFr+2ai+CEdDMb1UnsEYSDldDCSKPlgEPQviBXWvak0CwTewY0N qKX00Z4srgm5E55Rvec+pD0FyJoMfIYtSNxmCAYoiefEY8qNia+CMXLmAvljGKOHwqCT UDrA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d24si1692898ede.119.2019.09.20.09.27.24; Fri, 20 Sep 2019 09:27:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2437074AbfITCSm (ORCPT + 99 others); Thu, 19 Sep 2019 22:18:42 -0400 Received: from mga12.intel.com ([192.55.52.136]:26559 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2437065AbfITCSm (ORCPT ); Thu, 19 Sep 2019 22:18:42 -0400 X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 19 Sep 2019 19:18:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,526,1559545200"; d="scan'208";a="194565338" Received: from richard.sh.intel.com (HELO localhost) ([10.239.159.54]) by FMSMGA003.fm.intel.com with ESMTP; 19 Sep 2019 19:18:39 -0700 Date: Fri, 20 Sep 2019 10:18:21 +0800 From: Wei Yang To: Dave Hansen Cc: Wei Yang , dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, x86@kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] x86/mm: fix return value of p[um]dp_set_access_flags Message-ID: <20190920021821.GA8472@richard> Reply-To: Wei Yang References: <20190919082549.3895-1-richardw.yang@linux.intel.com> <307c9866-c037-5d87-709f-840bdb577283@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <307c9866-c037-5d87-709f-840bdb577283@intel.com> User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 19, 2019 at 10:25:05AM -0700, Dave Hansen wrote: >On 9/19/19 1:25 AM, Wei Yang wrote: >> Function p[um]dp_set_access_flags is used with update_mmu_cache_p[um]d >> and the return value from p[um]dp_set_access_flags indicates whether it >> is necessary to do the cache update. > >If this change is correct, why was it not applied to >ptep_set_access_flags()? That function has the same form. > Thanks, you are right, I missed this point. >BTW, I rather dislike the 'dirty' variable name. It seems to be >indicating whether the fault was a write fault or not and whether we >*expect* the dirty bit to be set in 'entry'. > I found some comments for ptep_set_access_flags in OLD code, which says /* * We only update the dirty/accessed state if we set * the dirty bit by hand in the kernel, since the hardware * will do the accessed bit for us, and we don't want to * race with other CPU's that might be updating the dirty * bit at the same time. */ I guess this is reason for the naming. Looks currently we use this value in some other way. >> From current code logic, only when changed && dirty, related page table >> entry would be updated. It is not necessary to update cache when the >> real page table entry is not changed. > >This logic doesn't really hold up, though. > >If we are only writing accessed and/or dirty bits, then we *never* need >to flush. The flush might avoid a stale TLB entry causing an extra page >walk by the hardware, but it's probably not ever worth the cost of the >flush. > >Unless there's something weird happening with paravirt, I can't ever see >a good reason to flush the TLB when just setting accessed/dirty bits on >bare-metal. > >This seems like a place where a debugging check to validate that only >accessed/dirty bits are only being set would be a good idea. The function update_mmu_cache_pmd is not for x86, IMHO, since it is empty on x86. I took a look into the definition on powerpc, which is defined in arch/powerpc/mm/book3s64/pgtable.c, which preload the content of this HPTE. So the cache update here is not TLB flush on x86. And then I compare the definition of ptep_set_access_flags on x86 and powerpc. On powerpc, which is defined in arch/powerpc/mm/pgtable.c, update the pte without checking *dirty*. This logic make sense to me. So I am confused with why on x86 we need to check both *changed* and *dirty*. Then I searched the history, and found commit 8dab5241d06bf may explain something. The commit log says: This patch reworks ptep_set_access_flags() semantics, implementations and callers so that it's now responsible for returning whether an update is necessary or not (basically whether the PTE actually changed). If my understanding is correct, the return value should indicate whether the PTE actually changed. But the change introduced in commit 8dab5241d06bf is not doing the exact thing on x86. Since we only change PTE only both *dirty* and *change*, just return *changed* seems not match the semantics. Last but not least, since update_mmu_cache_pmd is empty, even return value is not correct, it doesn't break anything. Hope my understanding is correct. -- Wei Yang Help you, Help me