Date: Tue, 31 Mar 2009 17:29:11 +0100
From: Alan Cox
To: Linus Torvalds
Cc: Ric Wheeler, Jens Axboe, Fernando Luis Vázquez Cao, Jeff Garzik,
    Christoph Hellwig, Theodore Tso, Ingo Molnar, Arjan van de Ven,
    Andrew Morton, Peter Zijlstra, Nick Piggin, David Rees, Jesper Krogh,
    Linux Kernel Mailing List, chris.mason@oracle.com, david@fromorbit.com,
    tj@kernel.org
Subject: Re: [PATCH 1/7] block: Add block_flush_device()
Message-ID: <20090331172911.1897d158@the-village.bc.nu>

> And the filesystem shouldn't know, and it most definitely must not act
> any differently. Because that's behind the abstraction, and there's no
> sane way to bring it _out_ of the abstraction that isn't fundamentally
> flawed (like thinking that it's always a SATA-II drive).
How the file system responds has to depend on what the user's intent is
with regard to still having their data. In a lot of cases "flush if you
can" makes good sense. In higher-integrity cases you want a way to tell
the device "flush if you can, do whatever else is needed to fake a flush
if not", and in some cases you genuinely want to propagate errors back at
mount time to say "sorry, can't do this".

Agreed entirely that this shouldn't be expressed down the stack in terms
of things like 'tags' or 'write with FUA', but unless the different
versions of it can be expressed, or refused, you can't build a good enough
abstraction. "Throw it at the block layer and pray it can fake it" simply
isn't a valid model for serious enterprise computing, nor, if people
understood the worst cases, for a lot of non-enterprise computing.

The second problem is who has sufficient information to efficiently handle
decisions around ordering/barriers/flushes/single outstanding command and
other strategies. I am skeptical that, where the underlying block
subsystem provides suboptimal ordering/barrier facilities, having it fall
back to alternatives without letting the fs also change strategies will be
efficient.

Alan
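
To make the "expressed, or refused" point concrete, here is a minimal C
sketch of what such a negotiation could look like. It is not taken from
the thread and is not an existing kernel interface: every name in it
(flush_intent, flush_strategy, blockdev_caps, negotiate_flush) is
hypothetical, and the capability flags are assumed to be known to the
block layer. The filesystem states its intent, and the block layer either
refuses or reports which strategy it will use, so the filesystem can adapt
its own strategy.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names throughout; not an existing kernel API. */

/* What the filesystem asks for, phrased as user intent. */
enum flush_intent {
    FLUSH_BEST_EFFORT,  /* "flush if you can" */
    FLUSH_OR_EMULATE,   /* flush, or do whatever is needed to fake one */
    FLUSH_REQUIRED,     /* a real flush, or refuse (e.g. at mount time) */
};

/* What the block layer reports it will actually do. */
enum flush_strategy {
    STRATEGY_NONE,      /* no durability guarantee is given */
    STRATEGY_NATIVE,    /* the device supports a real cache flush */
    STRATEGY_EMULATED,  /* a slower fallback stands in for the flush */
};

/* Assumed capability flags for one block device. */
struct blockdev_caps {
    bool native_flush;  /* device can flush its write cache */
    bool can_emulate;   /* block layer has a workable fallback */
};

/*
 * Returns 0 and stores the chosen strategy in *out, or -EOPNOTSUPP if the
 * requested intent cannot be honoured on this device.
 */
int negotiate_flush(const struct blockdev_caps *caps,
                    enum flush_intent intent,
                    enum flush_strategy *out)
{
    if (caps->native_flush) {
        *out = STRATEGY_NATIVE;
        return 0;
    }

    switch (intent) {
    case FLUSH_BEST_EFFORT:
        *out = STRATEGY_NONE;
        return 0;
    case FLUSH_OR_EMULATE:
        if (caps->can_emulate) {
            *out = STRATEGY_EMULATED;
            return 0;
        }
        return -EOPNOTSUPP;
    case FLUSH_REQUIRED:
        return -EOPNOTSUPP;
    }
    return -EINVAL;
}
```

Under these assumptions a filesystem would call negotiate_flush() once at
mount time, fail the mount on -EOPNOTSUPP, and otherwise pick its own
ordering strategy based on the value returned in *out, rather than
discovering later that the block layer has been quietly emulating.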