Date: Mon, 28 Jan 2019 17:42:39 +0000
From: Jonathan Cameron
To: Michal Hocko
CC: Andrea Arcangeli, Huang Ying, Zhang Yi, Dave Hansen, Liu Jingqi, Fan Du, Dong Eddie, LKML, "Linux Memory Management List", Peng Dong, Yao Yuan, Andrew Morton, Fengguang Wu, "Dan Williams", Mel Gorman
Subject: Re: [RFC][PATCH v2 00/21] PMEM NUMA node and hotness accounting/migration
Message-ID: <20190128174239.0000636b@huawei.com>
In-Reply-To: <20190102122110.00000206@huawei.com>
Organization: Huawei

On Wed, 2 Jan 2019 12:21:10 +0000
Jonathan Cameron wrote:

> On Fri, 28 Dec 2018 20:52:24 +0100
> Michal Hocko wrote:
>
> > [Ccing Mel and Andrea]
> >

Hi,

I just wanted to highlight this section as I didn't feel we really addressed this in the earlier conversation.

> * Hot pages may not be hot just because the host is using them a lot. It would be
> very useful to have a means of adding information available from accelerators
> beyond simple accessed bits (dreaming ;) One problem here is translation
> caches (ATCs) as they won't normally result in any updates to the page accessed
> bits. The arm SMMU v3 spec for example makes it clear (though it's kind of
> obvious) that the ATS request is the only opportunity to update the accessed
> bit. The nasty option here would be to periodically flush the ATC to force
> the access bit updates via repeats of the ATS request (ouch).
> That option only works if the iommu supports updating the accessed flag
> (optional on SMMU v3 for example).

If we ignore the IOMMU hardware update issue which will simply need to be addressed by future hardware if these techniques become common, how do we address the Address Translation Cache issue without potentially causing big performance problems by flushing the cache just to force an accessed bit update?

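To make the visibility problem concrete, here is a toy model of it. This is only a sketch: it is not kernel or SMMU code, and every class and function name in it is made up for illustration. It assumes the accessed bit can only be set when the device actually requests a translation, and that an ATC hit generates no request at all.

```python
# Toy model only: every name here is hypothetical; this is not kernel or SMMU code.

class PageTable:
    """Tracks one accessed bit per virtual page number (vpn)."""

    def __init__(self):
        self.accessed = {}

    def translate(self, vpn):
        # In this model a translation request (think: ATS request) is the
        # only point at which the accessed bit can be set.
        self.accessed[vpn] = True
        return vpn  # identity mapping keeps the sketch short

    def clear_accessed(self):
        # Stands in for a hotness scan that harvests and resets the bits.
        self.accessed = {}


class Device:
    """Accelerator with its own address translation cache (ATC)."""

    def __init__(self, page_table):
        self.page_table = page_table
        self.atc = {}

    def access(self, vpn):
        if vpn not in self.atc:
            # ATC miss: request a translation, which sets the accessed bit.
            self.atc[vpn] = self.page_table.translate(vpn)
        # ATC hit: no request is issued, so the page table sees nothing.
        return self.atc[vpn]

    def flush_atc(self):
        self.atc.clear()


def hot_pages_seen(page_table):
    """Pages a scan of the accessed bits would report as touched."""
    return sorted(page_table.accessed)


pt = PageTable()
dev = Device(pt)

# First touches miss the ATC, so the scan sees both pages.
dev.access(1)
dev.access(2)
first_scan = hot_pages_seen(pt)
pt.clear_accessed()

# Heavy use that is served entirely from the ATC is invisible to the scan.
for _ in range(1000):
    dev.access(1)
    dev.access(2)
second_scan = hot_pages_seen(pt)

# Flushing the ATC forces a fresh request on the next access.
dev.flush_atc()
dev.access(1)
third_scan = hot_pages_seen(pt)
```

In this model the second scan reports nothing even though both pages were hit 1000 times each, and only the flush makes page 1 visible again, at the cost of throwing away every cached translation.
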
These devices are frequently used with PRI and Shared Virtual Addressing and can be accessing most of your memory without you having any visibility of it in the page tables (as they aren't walked if your ATC is well matched in size to your use case).

Classic example would be accelerated DB walkers like the CCIX demo Xilinx has shown at a few conferences. The whole point of those is that most of the time only your large set of database walkers is using your memory, and they have translations cached for a good part of what they are accessing. Flushing that cache could hurt a lot. Pinning pages hurts for all the normal flexibility reasons.

Last thing we want is to be migrating these pages that can be very hot but in an invisible fashion.

Thanks,

Jonathan