From: Jeff Moyer
To: Dave Chinner
Cc: Brian Norris, Artem Bityutskiy, Richard Weinberger, Dongsheng Yang,
    linux-mtd@lists.infradead.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] ubifs: Allow O_DIRECT
Date: Tue, 25 Aug 2015 10:00:58 -0400
In-Reply-To: <20150824234611.GV3902@dastard> (Dave Chinner's message of "Tue, 25 Aug 2015 09:46:11 +1000")

Dave Chinner writes:

> On Mon, Aug 24, 2015 at 01:19:24PM -0400, Jeff Moyer wrote:
>> Brian Norris writes:
>>
>> > On Mon, Aug 24, 2015 at 10:13:25AM +0300, Artem Bityutskiy wrote:
>> >> Now, some user-space fails when direct I/O is not supported.
>> >
>> > I think the whole argument rested on what it means when "some user space
>> > fails"; apparently that "user space" is just a test suite (which
>> > can/should be fixed).
>>
>> Even if it wasn't a test suite it should still fail.  Either the fs
>> supports O_DIRECT or it doesn't.
>> Right now, the only way an application
>> can figure this out is to try an open and see if it fails.  Don't break
>> that.
>
> Who cares how a filesystem implements O_DIRECT as long as it does
> not corrupt data?  ext3 fell back to buffered IO in many situations,
> yet the only complaints about that were performance. IOWs, it's long been
> true that if the user cares about O_DIRECT *performance* then they
> have to be careful about their choice of filesystem.
>
> But if it's only 5 lines of code per filesystem to support O_DIRECT
> *correctly* via buffered IO, then exactly why should userspace have
> to jump through hoops to explicitly handle open(O_DIRECT) failure?
> Especially when you consider that all they can do is fall back to
> buffered IO themselves....

I had written counterpoints for all of this, but I thought better of
it.  Old versions of the kernel simply ignore O_DIRECT, so clearly
there's precedent.  I do think we should at least document what file
systems appear to be doing.  Here's a man page patch for open (generated
with extra context for easier reading).  Let me know what you think.

Cheers,
Jeff

p.s. I still think it's the wrong way to go, as it makes it harder for
an admin to determine what is actually going on.

diff --git a/man2/open.2 b/man2/open.2
index 06c0a29..acc438b 100644
--- a/man2/open.2
+++ b/man2/open.2
@@ -1471,17 +1471,18 @@ a flag of the same name, but without alignment restrictions.
 .LP
 .B O_DIRECT
 support was added under Linux in kernel version 2.4.10.
 Older Linux kernels simply ignore this flag.
 Some filesystems may not implement the flag and
 .BR open ()
 will fail with
 .B EINVAL
-if it is used.
+if it is used.  Other file systems may implement O_DIRECT via
+buffered I/O, which is essentially the same as ignoring the flag.
 .LP
 Applications should avoid mixing
 .B O_DIRECT
 and normal I/O to the same file,
 and especially to overlapping byte regions in the same file.
 Even when the filesystem correctly handles the coherency issues in
 this situation, overall I/O throughput is likely to be slower than
 using either mode alone.
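
For reference, the userspace handling debated in this thread (try
open() with O_DIRECT, and fall back to buffered I/O if the filesystem
rejects the flag with EINVAL) might look roughly like the sketch below.
The helper name open_prefer_direct() and its used_direct out-parameter
are hypothetical, made up for illustration; they are not part of the
patch or of any library. The sketch assumes Linux with glibc, where
O_DIRECT is only visible when _GNU_SOURCE is defined before the first
include.

```c
#define _GNU_SOURCE   /* assumption: Linux/glibc; exposes O_DIRECT in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only).
 *
 * Try to open `path` with O_DIRECT. If the open fails with EINVAL,
 * which is how a filesystem without O_DIRECT support reports it,
 * retry without the flag so the caller gets ordinary buffered I/O.
 *
 * *used_direct records which open succeeded. Note that `true` only
 * means open() accepted the flag: a filesystem that implements
 * O_DIRECT via buffered I/O is indistinguishable from here.
 *
 * Returns a file descriptor, or -1 with errno set.
 */
int open_prefer_direct(const char *path, int flags, mode_t mode,
                       bool *used_direct)
{
    assert(path != NULL && used_direct != NULL);

    int fd = open(path, flags | O_DIRECT, mode);
    if (fd >= 0) {
        *used_direct = true;
        return fd;
    }

    /* Any error other than EINVAL is unrelated to O_DIRECT support. */
    if (errno != EINVAL)
        return -1;

    *used_direct = false;
    return open(path, flags & ~O_DIRECT, mode);
}
```

A caller that cares about the difference can log used_direct, but as
the p.s. above points out, that value cannot tell an admin whether a
filesystem that accepted the flag is actually bypassing the page cache.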