From: Dan Williams
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
Date: Thu, 5 May 2016 08:15:32 -0700
References: <1461878218-3844-1-git-send-email-vishal.l.verma@intel.com> <1461878218-3844-6-git-send-email-vishal.l.verma@intel.com> <5727753F.6090104@plexistor.com> <20160505142433.GA4557@infradead.org>
In-Reply-To: <20160505142433.GA4557@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
To: Christoph Hellwig
Cc: Boaz Harrosh, linux-block@vger.kernel.org, linux-ext4, Jan Kara, Matthew Wilcox, Dave Chinner, "linux-kernel@vger.kernel.org", XFS Developers, Jens Axboe, Linux MM, Al Viro, linux-nvdimm, linux-fsdevel, Andrew Morton
Sender: owner-linux-mm@kvack.org
List-Id: linux-ext4.vger.kernel.org

On Thu, May 5, 2016 at 7:24 AM, Christoph Hellwig wrote:
> On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
>> > All IO in a dax filesystem used to go through dax_do_io, which cannot
>> > handle media errors, and thus cannot provide a recovery path that can
>> > send a write through the driver to clear errors.
>> >
>> > Add a new iocb flag for DAX, and set it only for DAX mounts. In the IO
>> > path for DAX filesystems, use the same direct_IO path for both DAX and
>> > direct_io iocbs, but use the flags to identify when we are in O_DIRECT
>> > mode vs non O_DIRECT with DAX, and for O_DIRECT, use the conventional
>> > direct_IO path instead of DAX.
>> >
>>
>> Really? What is your thinking here?
>>
>> What about all the current users of O_DIRECT, you have just made them
>> 4 times slower and "less concurrent*" than "buffered io" users. Since
>> direct_IO path will queue an IO request and all.
>> (And if it is not so slow then why do we need dax_do_io at all? [Rhetorical])
>>
>> I hate it that you overload the semantics of a known and expected
>> O_DIRECT flag, for special pmem quirks. This is an incompatible
>> and unrelated overload of the semantics of O_DIRECT.
>
> Agreed - making O_DIRECT less direct than not having it is plain stupid,
> and I somehow missed this initially.

Of course I disagree because, like Dave argues in the msync case, we
should do the correct thing first and make it fast later, but also like
Dave, I find this arguing in circles is getting tiresome.

> This whole DAX story turns into a major nightmare, and I fear all our
> hodge podge tweaks to the semantics aren't helping it.
>
> It seems like we simply need an explicit O_DAX for the read/write
> bypass if we can't sort out the semantics (error, writer synchronization)
> just as we need a special flag for MMAP.

I don't see how O_DAX makes this situation better if the goal is to
accelerate unmodified applications...

Vishal, at least the "delete a file with a badblock" model will still
work for implicitly clearing errors with your changes to stop doing
block clearing in fs/dax.c.  This, combined with a new -EBADBLOCK (as
Dave suggests) and explicit logging of I/Os that fail for this reason,
at least gives a chance to communicate errors in files to suitably
aware applications / environments.
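
For illustration only, here is a rough sketch of what a "suitably aware"
application might do with such an error code. Everything in it is an
assumption: -EBADBLOCK is only a proposal in this thread, so EIO stands
in for it, and the function name scan_for_bad_ranges and its read_at
callback are hypothetical. The idea is simply to walk a file block by
block and record the byte ranges whose reads fail with the media-error
code, so they can be logged or repaired by rewriting or deleting the file.

```python
import errno

# Assumption: the proposed -EBADBLOCK is not defined in any released kernel,
# so EIO is used here as a stand-in for "this range sits on a bad block".
EBADBLOCK = errno.EIO


def scan_for_bad_ranges(read_at, size, block=4096):
    """Return a list of (offset, length) ranges whose reads hit EBADBLOCK.

    read_at(offset, length) is any callable that returns bytes or raises
    OSError; it is a hypothetical hook so the logic can be tried without
    real faulty media.
    """
    bad = []
    offset = 0
    while offset < size:
        length = min(block, size - offset)
        try:
            read_at(offset, length)
        except OSError as exc:
            if exc.errno != EBADBLOCK:
                raise  # unrelated failure: do not report it as a media error
            if bad and bad[-1][0] + bad[-1][1] == offset:
                # Contiguous with the previous bad range: extend it.
                bad[-1] = (bad[-1][0], bad[-1][1] + length)
            else:
                bad.append((offset, length))
        offset += length
    return bad
```

Against a real file descriptor fd, read_at could be something like
lambda off, n: os.pread(fd, n, off), with size taken from
os.fstat(fd).st_size. Whether per-block probing like this is acceptable
depends on how the kernel ends up reporting and logging such errors,
which is exactly what is still under discussion here.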