From: Dan Williams
Date: Tue, 4 Dec 2018 16:26:46 -0800
Subject: Re: [PATCH RFC 2/3] mm: Add support for exposing if dev_pagemap supports refcount pinning
To: alexander.h.duyck@linux.intel.com
Cc: Barret Rhoden, Paolo Bonzini, Zhang Yi, KVM list, linux-nvdimm,
    Linux Kernel Mailing List, Linux MM, Dave Jiang, "Zhang, Yu C",
    Pankaj Gupta, David Hildenbrand, Jan Kara, Christoph Hellwig,
    rkrcmar@redhat.com, Jérôme Glisse
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 4, 2018 at 4:01 PM Alexander Duyck wrote:
>
> On Tue, 2018-12-04 at 18:24 -0500, Barret Rhoden wrote:
> > Hi -
> >
> > On 2018-12-04 at 14:51 Alexander Duyck
> > wrote:
> >
> > [snip]
> >
> > > > I think the confusion arises from the fact that there are a few MMIO
> > > > resources with a struct page and all the rest MMIO resources without.
> > > > The problem comes from the coarse definition of pfn_valid(), it may
> > > > return 'true' for things that are not System-RAM, because pfn_valid()
> > > > may be something as simplistic as a single "address < X" check. Then
> > > > PageReserved is a fallback to clarify the pfn_valid() result. The
> > > > typical case is that MMIO space is not caught up in this linear map
> > > > confusion. An MMIO address may or may not have an associated 'struct
> > > > page' and in most cases it does not.
> > >
> > > Okay. I think I understand this somewhat now. So the page might be
> > > physically there, but with the reserved bit it is not supposed to be
> > > touched.
> > >
> > > My main concern with just dropping the bit is that we start seeing some
> > > other uses that I was not certain what the impact would be. For example
> > > the functions like kvm_set_pfn_accessed start going in and manipulating
> > > things that I am not sure should be messed with for a DAX page.
> >
> > One thing regarding the accessed and dirty bits is that we might want
> > to have DAX pages marked dirty/accessed, even if we can't LRU-reclaim
> > or swap them. I don't have a real example and I'm fairly ignorant
> > about the specifics here. But one possibility would be using the A/D
> > bits to detect changes to a guest's memory for VM migration. Maybe
> > there would be issues with KSM too.
> >
> > Barret
>
> I get that, but the issue is that the code associated with those bits
> currently assumes you are working with either an anonymous swap backed
> page or a page cache page. We should really be updating that logic now,
> and then enabling DAX to access it rather than trying to do things the
> other way around which is how this feels.

Agree. I understand the concern about unintended side effects of
dropping PageReserved for dax pages, but they simply don't fit the
definition of the intended use of PageReserved.
We've already had fallout from legacy code paths doing the wrong thing
with dax pages where PageReserved wouldn't have helped. For example,
see commit 6e2608dfd934 "xfs, dax: introduce xfs_dax_aops", or commit
6100e34b2526 "mm, memory_failure: Teach memory_failure() about
dev_pagemap pages". So formally teaching kvm about these page
semantics and dropping the reliance on a side effect of PageReserved()
seems the right direction.

That said, for mark_page_accessed(), it does not look like it will
have any effect on dax pages. PageLRU will be false,
__lru_cache_activate_page() will not find a page on a percpu pagevec,
and workingset_activation() won't find an associated memcg. I would
not be surprised if mark_page_accessed() is already being called today
via the ext4 + dax use case.
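
To make the mark_page_accessed() reasoning concrete, here is a minimal
userspace sketch of the three checks mentioned above. It is not kernel
code: struct toy_page, toy_pagevec, toy_activations and the toy_*
helpers are all hypothetical names, and the control flow is a
simplified approximation of the kernel function rather than a copy of
it. It only illustrates why a page that is off the LRU, absent from
any pagevec and without a memcg should come out neither activated nor
counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel's layout (assumption). */
struct toy_page {
	bool lru;        /* models PageLRU() */
	bool active;     /* models PageActive() */
	bool referenced; /* models PageReferenced() */
	bool has_memcg;  /* models "page has an associated memcg" */
};

#define TOY_PVEC_SIZE 4

/* Models one per-cpu pagevec of pages waiting to be added to the LRU. */
struct toy_page *toy_pagevec[TOY_PVEC_SIZE];

/* Counts how often the modelled workingset activation actually fired. */
int toy_activations;

/* Models __lru_cache_activate_page(): only acts if the page is queued. */
bool toy_pagevec_activate(struct toy_page *page)
{
	for (size_t i = 0; i < TOY_PVEC_SIZE; i++) {
		if (toy_pagevec[i] == page) {
			page->active = true;
			return true;
		}
	}
	return false; /* page not on the pagevec: nothing happens */
}

/* Simplified model of mark_page_accessed(); hypothetical, see lead-in. */
void toy_mark_page_accessed(struct toy_page *page)
{
	assert(page != NULL);

	if (!page->active && page->referenced) {
		if (page->lru)
			page->active = true;        /* LRU page: activate */
		else
			toy_pagevec_activate(page); /* else look on the pagevec */
		page->referenced = false;
		if (page->has_memcg)
			toy_activations++;          /* models workingset_activation() */
	} else if (!page->referenced) {
		page->referenced = true;            /* first touch: remember it */
	}
}
```

Calling toy_mark_page_accessed() twice on a zero-initialised toy_page
(the dax-like case) leaves active clear and toy_activations at 0,
whereas a page with lru and has_memcg set is activated on the second
call. In this model the only state that changes for the dax-like page
is the referenced flag.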