Date: Wed, 27 Aug 2008 14:54:28 -0700 (PDT)
From: david@lang.hm
To: Dave Chinner
cc: Jamie Lokier, Nick Piggin, gus3, Szabolcs Szakacsits, Andrew Morton,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    xfs@oss.sgi.com
Subject: Re: XFS vs Elevators (was Re: [PATCH RFC] nilfs2: continuous
    snapshotting file system)
In-Reply-To: <20080827012013.GC5706@disturbed>
References: <20080821051508.GB5706@disturbed>
    <200808211933.34565.nickpiggin@yahoo.com.au>
    <20080821170854.GJ5706@disturbed>
    <200808221229.11069.nickpiggin@yahoo.com.au>
    <20080825015922.GP5706@disturbed>
    <20080825120146.GC20960@shareable.org>
    <20080826030759.GY5706@disturbed>
    <20080827012013.GC5706@disturbed>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 27 Aug 2008, Dave Chinner wrote:

> On Mon, Aug 25, 2008 at 08:50:14PM -0700, david@lang.hm wrote:
>> it sounds as if the various flag definitions have been evolving, would it
>> be worthwhile to step back and try to get the various filesystem folks to
>> brainstorm together on what types of hints they would _like_ to see
>> supported?
>
> Three types:
>
>	1. immediate dispatch - merge first with adjacent requests
>	   then dispatch
>	2. delayed dispatch - queue for a short while to allow
>	   merging of requests from above
>	3. bulk data - queue and merge. dispatch is completely
>	   controlled by the elevator

does this list change if you consider the fact that there may be a raid
array or some more complex structure for the block device instead of a
simple single disk partition?

since I am suggesting re-thinking the filesystem <-> elevator interface,
is there anything you need to have the elevator tell the filesystem? (I'm
thinking that this may be the path for the filesystem to learn things
about the block device that's under it: is it a raid array, a solid-state
drive, etc)

David Lang

> Basically most metadata and log writes would fall into category 2,
> with every logbufs/2 log writes or every log force using a category
> 1 to prevent log I/O from being stalled too long by other I/O.
>
> Data writes from the filesystem would appear as category 3 (read and write)
> and are subject to the specific elevator scheduling. That is, things
> like the CFQ ionice throttling would work on the bulk data queue,
> but not the other queues that the filesystem is using for metadata.
>
> Tagging the I/O as a sync I/O can still be done, but that only
> affects category 3 scheduling - category 1 or 2 would do the same
> thing whether sync or async....
>
>> it sounds like you are using 'sync' for things where you really should be
>> saying 'metadata' (or 'journal contents'), it's happened to work well
>> enough in the past, but it's forcing you to keep tweaking the
>> filesystems.
>
> Right, because there was no 'metadata' tagging, and 'sync' happened
> to do exactly what we needed on all elevators at the time.
>
>> it may be better to try and define things from the
>> filesystem point of view and let the elevators do the tweaking.
>>
>> basically I'm proposing a complete rethink of the filesystem <-> elevator
>> interface.
>
> Yeah, I've been saying that for a while w.r.t. the filesystem/block
> layer interfaces, esp. now with discard requests, data integrity,
> device alignment information, barriers, etc being exposed by the
> layers below the filesystem, but with no interface for filesystems
> to be able to access that information...
>
> Cheers,
>
> Dave.
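
For readers following the thread, the three categories Dave lists could be
sketched as a hint enum plus a small classifier. This is only an
illustration, not an existing kernel interface: every name below (io_hint,
io_kind, io_desc, classify_io, sync_flag_matters) is hypothetical, and
reading "every logbufs/2 log writes" as "every (logbufs / 2)-th log write"
is an assumption about the intended meaning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hint classes, one per category named in the thread. */
enum io_hint {
    IO_HINT_IMMEDIATE, /* 1: merge with adjacent requests, then dispatch */
    IO_HINT_DELAYED,   /* 2: queue briefly so requests from above can merge */
    IO_HINT_BULK       /* 3: queue and merge; the elevator controls dispatch */
};

/* Hypothetical description of what the filesystem is submitting. */
enum io_kind { IO_KIND_DATA, IO_KIND_METADATA, IO_KIND_LOG };

struct io_desc {
    enum io_kind kind;
    bool log_force;   /* request belongs to a log force */
    unsigned log_seq; /* running count of log writes, starting at 1 */
    unsigned logbufs; /* number of in-memory log buffers */
};

/* Map a request description to one of the three hint classes. */
enum io_hint classify_io(const struct io_desc *d)
{
    assert(d != NULL);

    /* Bulk data (reads and writes) is left entirely to the elevator. */
    if (d->kind == IO_KIND_DATA)
        return IO_HINT_BULK;

    if (d->kind == IO_KIND_LOG) {
        /* Assumed reading: every (logbufs / 2)-th log write is promoted. */
        unsigned interval = d->logbufs / 2;

        if (d->log_force)
            return IO_HINT_IMMEDIATE;
        if (interval > 0 && d->log_seq % interval == 0)
            return IO_HINT_IMMEDIATE;
    }

    /* Remaining metadata and log writes get a short merge window. */
    return IO_HINT_DELAYED;
}

/* Per the thread, a sync tag only changes scheduling for category 3. */
bool sync_flag_matters(enum io_hint hint)
{
    return hint == IO_HINT_BULK;
}
```

With logbufs set to 8, this sketch would send every fourth log write and
every log force down the immediate path, while ordinary metadata stays in
the delayed class and data I/O remains subject to whatever policy the
elevator applies (CFQ ionice throttling, for example).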