From: Andy Lutomirski
Date: Mon, 19 Aug 2013 16:31:30 -0700
Subject: Re: page fault scalability (ext3, ext4, xfs)
To: David Lang
Cc: Dave Chinner, Jan Kara, "Theodore Ts'o", Dave Hansen, Dave Hansen, Linux FS Devel, xfs@oss.sgi.com, "linux-ext4@vger.kernel.org", LKML, Tim Chen, Andi Kleen

On Mon, Aug 19, 2013 at 4:23 PM, David Lang wrote:
> On Fri, 16 Aug 2013, Dave Chinner wrote:
>
>> The problem with "not exported, don't update" is that files can be
>> modified on server startup (e.g. after a crash) or in short
>> maintenance periods when the NFS service is down. When the server is
>> started back up, the change number needs to indicate the file has
>> been modified so that clients reconnecting to the server see the
>> change.
>>
>> IOWs, even if the NFS server is not up or the filesystem not
>> exported we still need to update change counts whenever a file
>> changes if we are going to tell the NFS server that we keep them...
>
>
> This sounds like you need something more like relctime rather than noctime,
> something that updates the time in RAM but doesn't insist on flushing it to
> disk immediately, updating when convenient or when the file is closed.
>
> David Lang

I guess my patches could be extended to do this. In their current
form, when a pte dirty bit is transferred to a page (via page_mkclean
or unmap), the address_space is marked as needing a cmtime update. I
could add a mode in which even the normal write syscall path sets that
bit instead of immediately updating the timestamp. This could be a
nice speedup for non-mmap writers.

To avoid breaking things, operations like fsync would need to force a
cmtime flush -- I doubt it would be okay for write; fsync; write; fsync
to leave the timestamp matching the first write.

I'd rather get comments on the current form of my patches and maybe
get them merged before looking at even more far-reaching extensions,
though.

--Andy

--
Andy Lutomirski
AMA Capital Management, LLC
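
For illustration only, the deferred-update mode discussed in the reply above can be modelled in userspace roughly as follows. This is a toy sketch, not code from the patches: the names `Inode`, `Clock`, `cmtime_dirty`, `write`, `flush_cmtime` and `fsync` are all hypothetical, and the model assumes a flush stamps the time at which the flush happens.

```python
"""Toy userspace model of deferred cmtime updates (not kernel code)."""
from dataclasses import dataclass


@dataclass
class Inode:
    cmtime: int = 0             # timestamp currently visible to readers
    cmtime_dirty: bool = False  # hypothetical "needs a cmtime update" flag


class Clock:
    """Logical clock, so the example is deterministic."""

    def __init__(self) -> None:
        self.now = 0

    def tick(self) -> int:
        self.now += 1
        return self.now


def write(inode: Inode, clock: Clock, deferred: bool = True) -> None:
    """Model a write; time advances by one tick per write."""
    clock.tick()
    if deferred:
        # Deferred mode: only mark the inode, do not touch the timestamp yet.
        inode.cmtime_dirty = True
    else:
        # Immediate mode: update the timestamp as part of the write.
        inode.cmtime = clock.now


def flush_cmtime(inode: Inode, clock: Clock) -> None:
    """Apply a pending timestamp update, if there is one."""
    if inode.cmtime_dirty:
        inode.cmtime = clock.now  # assumption: stamp the flush time
        inode.cmtime_dirty = False


def fsync(inode: Inode, clock: Clock) -> None:
    """Model fsync as forcing the pending cmtime update out."""
    flush_cmtime(inode, clock)


if __name__ == "__main__":
    clock, inode = Clock(), Inode()
    write(inode, clock)
    fsync(inode, clock)
    first = inode.cmtime
    write(inode, clock)
    fsync(inode, clock)
    # The second fsync must not leave the timestamp of the first write.
    assert inode.cmtime > first
```

In this model a deferred write leaves `cmtime` untouched until something flushes it, which is why `fsync` has to call `flush_cmtime`: otherwise the write; fsync; write; fsync sequence would leave the timestamp matching the first write.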