Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753432AbbG2LVw (ORCPT ); Wed, 29 Jul 2015 07:21:52 -0400
Received: from youngberry.canonical.com ([91.189.89.112]:33479 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751794AbbG2LVu (ORCPT ); Wed, 29 Jul 2015 07:21:50 -0400
MIME-Version: 1.0
In-Reply-To: <20150729084102.GB16638@dastard>
References: <1437061068-26118-1-git-send-email-ming.lei@canonical.com>
 <1437061068-26118-5-git-send-email-ming.lei@canonical.com>
 <20150727084020.GA28336@infradead.org>
 <20150727094530.GA15507@infradead.org>
 <20150727173331.GA17594@infradead.org>
 <20150729084102.GB16638@dastard>
Date: Wed, 29 Jul 2015 07:21:47 -0400
Message-ID:
Subject: Re: [PATCH v7 4/6] block: loop: prepare for supporing direct IO
From: Ming Lei
To: Dave Chinner
Cc: Christoph Hellwig, Jens Axboe, Linux Kernel Mailing List,
 "Justin M. Forbes", Jeff Moyer, Tejun Heo, linux-api@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3119
Lines: 75

On Wed, Jul 29, 2015 at 4:41 AM, Dave Chinner wrote:
> On Wed, Jul 29, 2015 at 03:33:52AM -0400, Ming Lei wrote:
>> On Mon, Jul 27, 2015 at 1:33 PM, Christoph Hellwig wrote:
>> > On Mon, Jul 27, 2015 at 05:53:33AM -0400, Ming Lei wrote:
>> >> Because size has to be 4k aligned too.
>> >
>> > Yes. But again I don't see any reason to limit us to a hardcoded 512
>> > byte block size here, especially considering the patches to finally
>>
>> From loop block's view, the request size can be any count of 512-byte
>> sectors, then the transfer size to backing device can't guarantee to be
>> 4k aligned always.
>
> In theory, yes. In practise, doesn't happen very often.
>
>> > allow enabling other block sizes from userspace.
>>
>> I have some questions about the patchset, and looks the author doesn't
>> reply it yet.
>>
>> On Mon, Jul 27, 2015 at 6:06 PM, Dave Chinner wrote:
>> >> Because size has to be 4k aligned too.
>> >
>> > So check that, too. Any >= 4k block size filesystem should be doing
>> > mostly 4k aligned and sized IO...
>>
>> I guess you mean we only use direct IO for the 4k aligned and sized IO?
>> If so, that won't be efficient because the page cache has to be flushed
>> during the switch.
>
> It will be extremely rare for a 4k block size filesystem to do
> anything other than 4k aligned and sized IO. Think about it for a
> minute: what does the page cache do to unaligned IO patterns (i.e.
> buffered IO)? It does IO in page sizes, and so if the application
> is doing badly aligned or sized IO with buffered IO, then the
> underlying device will only ever see page sized and aligned IO.
>
> Hence sector aligned IO will only come from applications doing
> direct IO. If the application is doing direct IO and it's not
> properly aligned, then it already is going to get sucky performance
> because most filesystems serialise sub-block size direct IO because
> concurrent sub-block IOs to the same block usually lead to data
> corruption.

The block size of a filesystem on loop can be 512, 1024, or 2048 bytes.
Suppose the sector size of the backing device is 4096: the filesystem
then sees aligned direct IO whenever the IO size/offset from the
application is aligned with the fs block size, but loop still can't do
direct IO against the backing file for any of these requests.

Another case is that an application may access the loop block device
directly, such as 'dd if=/dev/loopN', but that may not be common, and
maybe it needn't be considered.

Thanks,

>
> So, really, sector aligned/sized direct IO is a sucky performance
> path before we even get to the loop device, so we don't really need
> to care how fast the loop device handles this case. The loop device
> just needs to ensure that it doesn't corrupt data when badly aligned
> IOs come in... ;)
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
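
The alignment argument in this thread can be illustrated with a small
sketch. It is not code from the loop driver or from the patch set; the
names is_dio_aligned and count_aligned are hypothetical, and the only
assumption taken from the discussion is that a request can go to the
backing file as direct IO only when both its byte offset and its byte
length are multiples of the backing device's block size.

```python
def is_dio_aligned(offset: int, size: int, backing_block_size: int) -> bool:
    """Return True if the byte offset and byte length are both multiples
    of backing_block_size (which must be a power of two)."""
    if backing_block_size <= 0 or backing_block_size & (backing_block_size - 1):
        raise ValueError("backing_block_size must be a positive power of two")
    mask = backing_block_size - 1
    # OR-ing offset and size lets one mask test cover both values.
    return size > 0 and ((offset | size) & mask) == 0


def count_aligned(fs_block_size: int, backing_block_size: int,
                  nblocks: int = 16) -> int:
    """Count how many of the first nblocks single-fs-block requests
    (each one fs block long, at consecutive fs-block offsets) pass the check."""
    return sum(
        is_dio_aligned(i * fs_block_size, fs_block_size, backing_block_size)
        for i in range(nblocks)
    )


if __name__ == "__main__":
    # Hypothetical setup: backing device with 4096-byte blocks, and
    # filesystems on loop using the block sizes mentioned in the thread.
    for fs_bs in (512, 1024, 2048, 4096):
        print(fs_bs, count_aligned(fs_bs, backing_block_size=4096))
```

With a 4096-byte backing block size, single-block requests from a
filesystem using 512, 1024, or 2048 byte blocks never pass the check,
even though they are aligned from the filesystem's point of view. Only
requests that happen to span whole 4096-byte units at 4096-byte offsets
would qualify, which is the case Dave expects to dominate for
filesystems with >= 4k block sizes.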