Date: Fri, 1 Feb 2002 10:04:50 +0100 (CET)
From: Ingo Molnar
To: Anton Blanchard
Cc: Linus Torvalds, Andrea Arcangeli, Rik van Riel, Momchil Velikov,
    John Stoffel, linux-kernel
Subject: Re: [PATCH] Radix-tree pagecache for 2.5
In-Reply-To: <20020131231242.GA4138@krispykreme>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 1 Feb 2002, Anton Blanchard wrote:

> There were a few solutions (from davem and ingo) to allocate a larger
> hash but with the radix patch we no longer have to worry about this.

there is one big issue we forgot to consider.

in the case of radix trees it's not only search depth that gets worse with
big files. The thing i'm worried about is the 'big pagecache lock' being
reintroduced again. If e.g. a database application puts lots of data into a
single file (multiple gigabytes - why not), then the
mapping->i_shared_lock becomes a 'big pagecache lock' again, causing
serious SMP contention for even the read() case. Benchmarks show that it's
the distribution of locks that matters on big boxes.

dbench hides this issue, because it uses many temporary files, so the
locking overhead is distributed. Would you be willing to run benchmarks
that measure the scalability of reading from one bigger file, from
multiple CPUs?

with hash based locking, the locking overhead is *always* distributed.
with radix trees the locking overhead is distributed only if multiple
files are used.
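
A toy sketch of the distribution argument above, not kernel code. It only counts how many distinct locks a workload would touch under each scheme. The lock count of 64, the hash function and all names here are assumptions made up for illustration.

```python
# Toy model: which lock protects a given (file, page) pagecache lookup?
# Everything here is hypothetical; it is not how the kernel implements it.

NUM_HASH_LOCKS = 64  # assumed number of hash-bucket locks


def hashed_lock_index(file_id, page_index, num_locks=NUM_HASH_LOCKS):
    """Hash-based locking: the lock is chosen from (file, page)."""
    return (file_id * 31 + page_index) % num_locks


def per_file_lock_index(file_id, page_index):
    """Per-mapping locking: every page of a file shares that file's lock."""
    return file_id


def locks_touched(accesses, lock_of):
    """Number of distinct locks hit by a list of (file_id, page_index)."""
    return len({lock_of(f, p) for f, p in accesses})


# One big file read at 1024 different page offsets.
one_big_file = [(0, page) for page in range(1024)]

# dbench-like pattern: 64 small files, 16 pages each.
many_small_files = [(f, page) for f in range(64) for page in range(16)]

if __name__ == "__main__":
    for name, workload in (("one big file", one_big_file),
                           ("many small files", many_small_files)):
        print(name,
              "hashed:", locks_touched(workload, hashed_lock_index),
              "per-file:", locks_touched(workload, per_file_lock_index))
```

In this model the many-files workload spreads over 64 locks under either scheme, while the single big file spreads over 64 locks with hashing but funnels every lookup through one lock with per-file locking. It says nothing about actual contention cost, which only a benchmark on real hardware can show.
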
With one big file (or a few big files), the i_shared_lock will
always bounce between CPUs wildly in read() workloads, degrading
scalability just as much as it is degraded with the pagecache_lock now.

	Ingo