Date: Tue, 15 May 2018 09:20:03 -0700
From: Matthew Wilcox
To: Huaisheng HS1 Ye
Cc: Jeff Moyer, Dan Williams, Michal Hocko, linux-nvdimm, Tetsuo Handa,
    NingTing Cheng, Dave Hansen, Linux Kernel Mailing List,
    "pasha.tatashin@oracle.com", Linux MM, "colyli@suse.de",
    Johannes Weiner, Andrew Morton, Sasha Levin, Mel Gorman,
    Vlastimil Babka, Ocean HY1 He
Subject: Re: [External] Re: [RFC PATCH v1 0/6] use mm to manage NVDIMM (pmem) zone
Message-ID: <20180515162003.GA26489@bombadil.infradead.org>
References: <1525704627-30114-1-git-send-email-yehs1@lenovo.com>
    <20180507184622.GB12361@bombadil.infradead.org>
    <20180508030959.GB16338@bombadil.infradead.org>
    <20180510162742.GA30442@bombadil.infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 15, 2018 at 04:07:28PM +0000, Huaisheng HS1 Ye wrote:
> > From: owner-linux-mm@kvack.org [mailto:owner-linux-mm@kvack.org] On Behalf Of Matthew
> > Wilcox
> > No. In the current situation, the user knows that either the entire
> > page was written back from the pagecache or none of it was (at least
> > with a journalling filesystem). With your proposal, we may have pages
> > splintered along cacheline boundaries, with a mix of old and new data.
> > This is completely unacceptable to most customers.
>
> Dear Matthew,
>
> Thanks for your great help, I really didn't consider this case.
> I want to make it a little bit clearer to me. So, correct me if anything wrong.
>
> Is that to say this mix of old and new data in one page, which only has chance to happen when CPU failed to flush all dirty data from LLC to NVDIMM?
> But if an interrupt can be reported to CPU, and CPU successfully flush all dirty data from cache lines to NVDIMM within interrupt response function, this mix of old and new data can be avoided.

If you can keep the CPU and the memory (and all the busses between them)
alive for long enough after the power signal has been tripped, yes.
Talk to your hardware designers about what it will take to achieve this
:-) Be sure to ask about the number of retries which may be necessary
on the CPU interconnect to flush all data to an NV-DIMM attached to a
remote CPU.

> Current X86_64 uses N-way set associative cache, and every cache line has 64 bytes.
> For 4096 bytes page, one page shall be splintered to 64 (4096/64) lines. Is it right?

That's correct.

> > > > Then there's the problem of reconnecting the page cache (which is
> > > > pointed to by ephemeral data structures like inodes and dentries) to
> > > > the new inodes.
> > > Yes, it is not easy.
> >
> > Right ... and until we have that ability, there's no point in this patch.
> We are focusing to realize this ability.

But is it the right approach? So far we have (I think) two parallel
activities. The first is for local storage, using DAX to store files
directly on the pmem. The second is a physical block cache for network
filesystems (both NAS and SAN). You seem to be wanting to supplant the
second effort, but I think it's much harder to reconnect the logical cache
(ie the page cache) than it is the physical cache (ie the block cache).
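
To make the splintering concern concrete, here is a rough toy model, not kernel code. It uses the 4096-byte page and 64-byte cache line figures from the exchange above. The names flush_page and torn_lines are made up for illustration, and flushing lines in ascending order is a simplifying assumption; real hardware promises no such ordering.

```python
PAGE_SIZE = 4096        # bytes per page, as in the thread
CACHE_LINE_SIZE = 64    # bytes per x86-64 cache line, as in the thread
LINES_PER_PAGE = PAGE_SIZE // CACHE_LINE_SIZE


def flush_page(persistent, dirty, lines_flushed):
    """Copy the first `lines_flushed` cache lines of `dirty` over `persistent`.

    Hypothetical helper: models a flush that stops early because power
    ran out. Ascending flush order is an assumption made for simplicity.
    """
    result = bytearray(persistent)
    end = lines_flushed * CACHE_LINE_SIZE
    result[:end] = dirty[:end]
    return bytes(result)


def torn_lines(page, old, new):
    """Classify each cache line of `page` as 'old', 'new', or 'other'."""
    states = []
    for i in range(LINES_PER_PAGE):
        s = slice(i * CACHE_LINE_SIZE, (i + 1) * CACHE_LINE_SIZE)
        if page[s] == new[s]:
            states.append("new")
        elif page[s] == old[s]:
            states.append("old")
        else:
            states.append("other")
    return states


old_page = bytes([0xAA]) * PAGE_SIZE   # contents already on the NVDIMM
new_page = bytes([0xBB]) * PAGE_SIZE   # dirty contents still in the CPU cache

# Power fails after 40 of the 64 lines have reached the NVDIMM.
after_crash = flush_page(old_page, new_page, lines_flushed=40)
states = torn_lines(after_crash, old_page, new_page)
print(LINES_PER_PAGE, states.count("new"), states.count("old"))  # 64 40 24
```

The page that survives the crash is neither the old version nor the new one, which is exactly the state a journalling filesystem's whole-page writeback is meant to rule out.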