Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1761190AbcJSUBJ (ORCPT ); Wed, 19 Oct 2016 16:01:09 -0400 Received: from mail-yw0-f172.google.com ([209.85.161.172]:36075 "EHLO mail-yw0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761117AbcJSUBI (ORCPT ); Wed, 19 Oct 2016 16:01:08 -0400 MIME-Version: 1.0 In-Reply-To: <20161019184028.GB16550@cgy1-donard.priv.deltatee.com> References: <1476826937-20665-1-git-send-email-sbates@raithlin.com> <1476826937-20665-2-git-send-email-sbates@raithlin.com> <20161019184028.GB16550@cgy1-donard.priv.deltatee.com> From: Dan Williams Date: Wed, 19 Oct 2016 13:01:06 -0700 Message-ID: Subject: Re: [PATCH 1/3] memremap.c : Add support for ZONE_DEVICE IO memory with struct pages. To: Stephen Bates Cc: "linux-kernel@vger.kernel.org" , "linux-nvdimm@lists.01.org" , linux-rdma@vger.kernel.org, linux-block@vger.kernel.org, Linux MM , Ross Zwisler , Matthew Wilcox , Jason Gunthorpe , haggaie@mellanox.com, Christoph Hellwig , Jens Axboe , Jonathan Corbet , jim.macdonald@everspin.com, sbates@raithin.com, Logan Gunthorpe Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2528 Lines: 52 On Wed, Oct 19, 2016 at 11:40 AM, Stephen Bates wrote: > On Wed, Oct 19, 2016 at 10:50:25AM -0700, Dan Williams wrote: >> On Tue, Oct 18, 2016 at 2:42 PM, Stephen Bates wrote: >> > From: Logan Gunthorpe >> > >> > We build on recent work that adds memory regions owned by a device >> > driver (ZONE_DEVICE) [1] and to add struct page support for these new >> > regions of memory [2]. >> > >> > 1. Add an extra flags argument into dev_memremap_pages to take in a >> > MEMREMAP_XX argument. We update the existing calls to this function to >> > reflect the change. >> > >> > 2. For completeness, we add MEMREMAP_WT support to the memremap; >> > however we have no actual need for this functionality. >> > >> > 3. We add the static functions, add_zone_device_pages and >> > remove_zone_device pages. These are similar to arch_add_memory except >> > they don't create the memory mapping. We don't believe these need to be >> > made arch specific, but are open to other opinions. >> > >> > 4. dev_memremap_pages and devm_memremap_pages_release are updated to >> > treat IO memory slightly differently. For IO memory we use a combination >> > of the appropriate io_remap function and the zone_device pages functions >> > created above. A flags variable and kaddr pointer are added to struct >> > page_mem to facilitate this for the release function. We also set up >> > the page attribute tables for the mapped region correctly based on the >> > desired mapping. >> > >> >> This description says "what" is being done, but not "why". > > Hi Dan > > We discuss the motivation in the cover letter. > >> >> In the cover letter, "[PATCH 0/3] iopmem : A block device for PCIe >> memory", it mentions that the lack of I/O coherency is a known issue >> and users of this functionality need to be cognizant of the pitfalls. >> If that is the case why do we need support for different cpu mapping >> types than the default write-back cache setting? It's up to the >> application to handle cache cpu flushing similar to what we require of >> device-dax users in the persistent memory case. > > Some of the iopmem hardware we have tested has certain alignment > restrictions on BAR accesses. At the very least we require write > combine mappings for these. We then felt it appropriate to add the > other mappings for the sake of completeness. If the device can support write-combine then it can support bursts, so I wonder why it couldn't support read bursts for cache fills...