Date: Fri, 21 Oct 2016 10:22:39 +1100
From: Dave Chinner
To: Stephen Bates
Cc: Dan Williams, "linux-kernel@vger.kernel.org",
    "linux-nvdimm@lists.01.org", linux-rdma@vger.kernel.org,
    linux-block@vger.kernel.org, Linux MM, Ross Zwisler, Matthew Wilcox,
    jgunthorpe@obsidianresearch.com, haggaie@mellanox.com,
    Christoph Hellwig, Jens Axboe, Jonathan Corbet,
    jim.macdonald@everspin.com, sbates@raithin.com, Logan Gunthorpe,
    David Woodhouse, "Raj, Ashok"
Subject: Re: [PATCH 0/3] iopmem : A block device for PCIe memory
Message-ID: <20161020232239.GQ23194@dastard>
References: <1476826937-20665-1-git-send-email-sbates@raithlin.com>
    <20161019184814.GC16550@cgy1-donard.priv.deltatee.com>
In-Reply-To: <20161019184814.GC16550@cgy1-donard.priv.deltatee.com>

On Wed, Oct 19, 2016 at 12:48:14PM -0600, Stephen Bates wrote:
> On Tue, Oct 18, 2016 at 08:51:15PM -0700, Dan Williams wrote:
> > [ adding Ashok and David for potential iommu comments ]
> >
>
> Hi Dan
>
> Thanks for adding Ashok and David!
>
> >
> > I agree with the motivation and the need for a solution, but I have
> > some questions about this implementation.
> >
> > >
> > > Consumers
> > > ---------
> > >
> > > We provide a PCIe device driver in an accompanying patch that can be
> > > used to map any PCIe BAR into a DAX capable block device. For
> > > non-persistent BARs this simply serves as an alternative to using
> > > system memory bounce buffers. For persistent BARs this can serve as an
> > > additional storage device in the system.
> >
> > Why block devices? I wonder if iopmem was initially designed back
> > when we were considering enabling DAX for raw block devices. However,
> > that support has since been ripped out / abandoned. You currently
> > need a filesystem on top of a block-device to get DAX operation.
> > Putting xfs or ext4 on top of PCI-E memory mapped range seems awkward
> > if all you want is a way to map the bar for another PCI-E device in
> > the topology.
> >
> > If you're only using the block-device as a entry-point to create
> > dax-mappings then a device-dax (drivers/dax/) character-device might
> > be a better fit.
> >
>
> We chose a block device because we felt it was intuitive for users to
> carve up a memory region but putting a DAX filesystem on it and creating
> files on that DAX aware FS. It seemed like a convenient way to
> partition up the region and to be easily able to get the DMA address
> for the memory backing the device.

You do realise that local filesystems can silently change the
location of file data at any point in time, so there is no such
thing as a "stable mapping" of file data to block device addresses
in userspace?

If you want remote access to the blocks owned and controlled by a
filesystem, then you need to use a filesystem with a remote locking
mechanism to allow co-ordinated, coherent access to the data in
those blocks. Anything else is just asking for ongoing, unfixable
filesystem corruption or data leakage problems (i.e. security
issues).
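
To make the failure mode concrete, here is a toy sketch in Python. It
is not code from any real filesystem or from the iopmem patches: the
ToyFS class, its first-free allocator and the relocate() helper are all
hypothetical, invented purely for illustration. It models a filesystem
that moves a file's block (as defragmentation or copy-on-write might)
after some outside party has cached the block's physical address.

```python
class ToyFS:
    """Hypothetical toy filesystem: maps (file, logical block) to a physical block."""

    def __init__(self, nblocks):
        self.device = [None] * nblocks   # physical blocks; None means free
        self.extent_map = {}             # (name, logical) -> physical index

    def _alloc(self):
        # First-fit allocation: return the lowest-numbered free block.
        return self.device.index(None)

    def write(self, name, logical, data):
        key = (name, logical)
        if key not in self.extent_map:
            self.extent_map[key] = self._alloc()
        self.device[self.extent_map[key]] = data

    def read(self, name, logical):
        return self.device[self.extent_map[(name, logical)]]

    def physical_address(self, name, logical):
        # Only a snapshot: valid until the filesystem next moves the block.
        return self.extent_map[(name, logical)]

    def relocate(self, name, logical):
        # Stand-in for any operation that moves data without telling userspace.
        key = (name, logical)
        old = self.extent_map[key]
        new = self._alloc()
        self.device[new] = self.device[old]
        self.device[old] = None          # old block returns to the free pool
        self.extent_map[key] = new


def demo():
    fs = ToyFS(8)
    fs.write("a", 0, "data-of-a")
    cached = fs.physical_address("a", 0)  # a remote peer caches this address
    fs.relocate("a", 0)                   # filesystem silently moves the block
    fs.write("b", 0, "data-of-b")         # freed block is reused by file "b"
    leaked = fs.device[cached]            # stale read sees file "b" (data leak)
    fs.device[cached] = "stale-dma-write" # stale write lands in file "b" (corruption)
    return cached, fs.physical_address("a", 0), leaked, fs.read("a", 0), fs.read("b", 0)


if __name__ == "__main__":
    print(demo())
```

In this model the cached address keeps "working" after the move, which
is what makes the problem silent: the peer reads another file's data
and then overwrites it, and nothing in the filesystem notices. A
co-ordinated locking scheme would have to block or invalidate the
cached address before relocate() is allowed to run.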

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com