Date: Thu, 30 Jul 2015 08:08:29 +1000
From: Dave Chinner
To: Ming Lei
Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List, "Justin M. Forbes", Jeff Moyer, Tejun Heo, linux-api@vger.kernel.org
Subject: Re: [PATCH v7 4/6] block: loop: prepare for supporing direct IO
Message-ID: <20150729220829.GM3902@dastard>
References: <1437061068-26118-1-git-send-email-ming.lei@canonical.com> <1437061068-26118-5-git-send-email-ming.lei@canonical.com> <20150727084020.GA28336@infradead.org> <20150727094530.GA15507@infradead.org> <20150727173331.GA17594@infradead.org> <20150729084102.GB16638@dastard>

On Wed, Jul 29, 2015 at 07:21:47AM -0400, Ming Lei wrote:
> On Wed, Jul 29, 2015 at 4:41 AM, Dave Chinner wrote:
> > On Wed, Jul 29, 2015 at 03:33:52AM -0400, Ming Lei wrote:
> >> On Mon, Jul 27, 2015 at 1:33 PM, Christoph Hellwig wrote:
> >> > On Mon, Jul 27, 2015 at 05:53:33AM -0400, Ming Lei wrote:
> >> >> Because size has to be 4k aligned too.
> >> >
> >> > Yes.
> >> > But again I don't see any reason to limit us to a hardcoded 512
> >> > byte block size here, especially considering the patches to finally
> >>
> >> From loop block's view, the request size can be any count of 512-byte
> >> sectors, then the transfer size to backing device can't guarantee to be
> >> 4k aligned always.
> >
> > In theory, yes. In practise, doesn't happen very often.
> >
> >> > allow enabling other block sizes from userspace.
> >>
> >> I have some questions about the patchset, and looks the author doesn't
> >> reply it yet.
> >>
> >> On Mon, Jul 27, 2015 at 6:06 PM, Dave Chinner wrote:
> >> >> Because size has to be 4k aligned too.
> >> >
> >> > So check that, too. Any >= 4k block size filesystem should be doing
> >> > mostly 4k aligned and sized IO...
> >>
> >> I guess you mean we only use direct IO for the 4k aligned and sized IO?
> >> If so, that won't be efficient because the page cache has to be flushed
> >> during the switch.
> >
> > It will be extremely rare for a 4k block size filesystem to do
> > anything other than 4k aligned and sized IO. Think about it for a
> > minute: what does the page cache do to unaligned IO patterns (i.e.
> > buffered IO)? It does IO in page sizes, and so if the application
> > is doing badly aligned or sized IO with buffered IO, then the
> > underlying device will only ever see page sized and aligned IO.
> >
> > Hence sector aligned IO will only come from applications doing
> > direct IO. If the application is doing direct IO and it's not
> > properly aligned, then it already is going to get sucky performance
> > because most filesystems serialise sub-block size direct IO because
> > concurrent sub-block IOs to the same block usually lead to data
> > corruption.
>
> The blocksize of filesystem over loop can be 512, 1024, 2048, and
> suppose sector size of backing device is 4096, then filesystem
> can see aligned direct IO when IO size/offset from application is aligned
> with fs block size, but loop still can't do direct IO for all this
> kind of requests against backing file.

Sure, but again you're talking about a fairly rare configuration.
The vast majority of filesystems use 4k block sizes, just like the
vast majority of applications use buffered IO. Don't jump through
hoops to optimise a case that probably doesn't need optimising. Make
it work correctly first, then optimise performance later when
someone has a need for it to be really fast.

> Another case is that application may access loop block directly, such
> as 'dd if=/dev/loopN', but it may not be common, and maybe it needn't
> to consider.

'dd if=/dev/loopN bs=4k....'

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
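
For illustration only, the alignment argument in this thread can be
sketched in userspace C. This is a minimal sketch, not code from the
patch set: dio_aligned() and page_round() are hypothetical helper
names, and the logical block size and page size are passed in as
assumed parameters rather than queried from a real device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): returns true when both
 * the byte offset and the length are multiples of the backing device's
 * logical block size. The block size is assumed to be a power of two.
 */
static bool dio_aligned(uint64_t offset, uint64_t len, uint32_t lbs)
{
	assert(lbs != 0 && (lbs & (lbs - 1)) == 0);
	return ((offset | len) & ((uint64_t)lbs - 1)) == 0;
}

/*
 * Hypothetical helper: round a byte range out to page boundaries, which
 * is roughly what buffered IO through the page cache presents to the
 * underlying device. The page size is assumed to be a power of two.
 */
static void page_round(uint64_t offset, uint64_t len, uint32_t page_size,
		       uint64_t *start, uint64_t *end)
{
	uint64_t mask = (uint64_t)page_size - 1;

	*start = offset & ~mask;              /* round the start down */
	*end = (offset + len + mask) & ~mask; /* round the end up */
}
```

With a 4096-byte logical block size, a 1024-byte request at offset 0
(the small-fs-block case quoted above) fails the check, while a badly
aligned buffered request such as 100 bytes at offset 700 is widened by
page_round() to the full page [0, 4096).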