Subject: Re: [RFC v1][PATCH] page_fault retry with NOPAGE_RETRY
From: Ying Han
To: Török Edwin
Cc: Nick Piggin, Mike Waychison, Ingo Molnar, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, akpm, David Rientjes, Rohit Seth,
    Hugh Dickins, Peter Zijlstra, "H. Peter Anvin"
Date: Thu, 4 Dec 2008 14:27:54 -0800
Message-ID: <604427e00812041427j7f1c8118p48b1b5b577143703@mail.gmail.com>
In-Reply-To: <492E97FA.5000804@gmail.com>

I am trying your test program (scalability) in house, but somehow I get
different results from what you saw. I created 8 files, each 1 GB in size,
on separate drives (to avoid disturbance from disk-seek latency). I got
the numbers below without applying the patch, on 2.6.26. May I ask how to
reproduce the mmap issue you mentioned?

8 CPU

read_worker
1 thread   Real time: 101.058262 s (since task start)
2 threads  Real time:  50.670456 s (since task start)
4 threads  Real time:  25.904657 s (since task start)
8 threads  Real time:  20.090677 s (since task start)

--------------------------------------------------------------------------------

mmap_worker
1 thread   Real time: 101.340662 s (since task start)
2 threads  Real time:  51.484646 s (since task start)
4 threads  Real time:  28.414534 s (since task start)
8 threads  Real time:  21.785818 s (since task start)
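
For reference, the workers do roughly the following (a simplified sketch,
not the exact test program: the buffer size, the way files are divided
among threads, and the error handling are illustrative, and the wall-clock
timing above is taken outside the workers):

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define BUF_SZ (64 * 1024)

static char **files;            /* file list shared by all threads */
static int nfiles, nthreads;
static int use_mmap;

/* Scan one file end to end with read(2); the data is discarded. */
static void scan_read(const char *path)
{
        char buf[BUF_SZ];
        int fd = open(path, O_RDONLY);

        if (fd < 0) {
                perror(path);
                return;
        }
        while (read(fd, buf, sizeof(buf)) > 0)
                ;
        close(fd);
}

/* Scan one file by mmap()ing it and touching one byte per page. */
static void scan_mmap(const char *path)
{
        struct stat st;
        volatile char sink = 0;
        char *p;
        off_t i;
        int fd = open(path, O_RDONLY);

        if (fd < 0) {
                perror(path);
                return;
        }
        if (fstat(fd, &st) < 0) {
                perror(path);
                close(fd);
                return;
        }
        p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                close(fd);
                return;
        }
        for (i = 0; i < st.st_size; i += 4096)  /* one fault per page */
                sink += p[i];
        (void)sink;
        munmap(p, st.st_size);
        close(fd);
}

/* Thread k scans files k, k + nthreads, k + 2 * nthreads, ... */
static void *worker(void *arg)
{
        int k = (int)(long)arg;
        int i;

        for (i = k; i < nfiles; i += nthreads) {
                if (use_mmap)
                        scan_mmap(files[i]);
                else
                        scan_read(files[i]);
        }
        return NULL;
}

int main(int argc, char **argv)
{
        pthread_t tid[64];
        int i;

        if (argc < 4) {
                fprintf(stderr, "usage: %s read|mmap nthreads file...\n", argv[0]);
                return 1;
        }
        use_mmap = !strcmp(argv[1], "mmap");
        nthreads = atoi(argv[2]);
        files    = &argv[3];
        nfiles   = argc - 3;
        if (nthreads < 1 || nthreads > 64)
                return 1;

        for (i = 0; i < nthreads; i++)
                pthread_create(&tid[i], NULL, worker, (void *)(long)i);
        for (i = 0; i < nthreads; i++)
                pthread_join(tid[i], NULL);
        return 0;
}

In the mmap case every page touched goes through a fault, and each fault
takes mmap_sem for read, which is where the retry patch is supposed to help.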

--Ying

On Thu, Nov 27, 2008 at 4:52 AM, Török Edwin wrote:
> On 2008-11-27 14:39, Nick Piggin wrote:
>> On Thu, Nov 27, 2008 at 02:21:16PM +0200, Török Edwin wrote:
>>> On 2008-11-27 14:03, Nick Piggin wrote:
>>>>> Running my testcase shows no significant performance difference. What am
>>>>> I doing wrong?
>>>>
>>>> Software may just be doing a lot of mmap/munmap activity. threads +
>>>> mmap is never going to be pretty because it is always going to involve
>>>> broadcasting tlb flushes to other cores... Software writers shouldn't
>>>> be scared of using processes (possibly with some shared memory).
>>>>
>>> It would be interesting to compare the performance of a threaded clamd,
>>> and of a clamd that uses multiple processes.
>>> Distributing tasks will be a bit more tricky, since it would need to use
>>> IPC instead of mutexes and condition variables.
>>>
>> Yes, although you could use PTHREAD_PROCESS_SHARED pthread mutexes on
>> the shared memory I believe (having never tried it myself).
>>
>>>> Actually, a lot of things get faster (like malloc, or file descriptor
>>>> operations) because locks aren't needed.
>>>>
>>>> Despite common perception, processes are actually much *faster* than
>>>> threads when doing common operations like these. They are slightly slower
>>>> sometimes with things like creation and exit, or context switching, but
>>>> if you're doing huge numbers of those operations, then it is unlikely
>>>> to be a performance-critical app... :)
>>>>
>>> How about distributing tasks to a set of worker threads; is the overhead
>>> of using IPC instead of mutexes/condition variables acceptable?
>>>
>> It is really going to depend on a lot of things. What is involved in
>> distributing tasks, how many cores, the cache/TLB architecture of the
>> system it is running on, etc.
>>
>> You want to distribute as much work as possible while touching as
>> little memory as possible, in general.
>>
>> But if you're distributing threads over cores, and shared caches are
>> physically tagged (which I think all x86 CPUs are), then you should
>> be able to have multiple processes operate on shared memory just as
>> efficiently as multiple threads, I think.
>>
>> And then you also get the advantages of reduced contention on other
>> shared locks and resources.
>
> Thanks for the tips, but let's get back to the original question:
> why don't I see any performance improvement with the fault-retry patches?
>
> My testcase only compares reading files with mmap vs. reading files with
> read, with different numbers of threads.
> Leaving aside other reasons why mmap is slower, there should be some
> speedup running 4 threads vs. 1 thread, but:
>
> 1 thread:  read: 27.18, 28.76
> 1 thread:  mmap: 25.45, 25.24
> 2 threads: read: 16.03, 15.66
> 2 threads: mmap: 22.20, 20.99
> 4 threads: read:  9.15,  9.12
> 4 threads: mmap: 20.38, 20.47
>
> The speed of 4 threads is about the same as for 2 threads with mmap, yet
> with read it scales nicely.
> And the patch doesn't seem to improve scalability.
> How can I find out if the patch works as expected? [i.e. verify that
> faults are actually retried, and that they don't keep the semaphore locked]
>
>> OK, I'll see if I can find them (am overseas at the moment, and I suspect
>> they are stranded on some stationary rust back home, but I might be able
>> to find them on the web).
>
> Ok.
>
> Best regards,
> --Edwin
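
P.S. Regarding the PTHREAD_PROCESS_SHARED suggestion quoted above: a minimal
sketch of the idea, assuming an anonymous shared mapping set up before fork()
(untested here; the struct layout and iteration counts are only illustrative):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared {
        pthread_mutex_t lock;
        long counter;
};

int main(void)
{
        pthread_mutexattr_t attr;
        struct shared *sh;
        int i;

        /* Anonymous shared mapping: visible to the child after fork(). */
        sh = mmap(NULL, sizeof(*sh), PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (sh == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        /* The mutex must be marked process-shared before initialization. */
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutex_init(&sh->lock, &attr);
        pthread_mutexattr_destroy(&attr);
        sh->counter = 0;

        if (fork() == 0) {                      /* child */
                for (i = 0; i < 100000; i++) {
                        pthread_mutex_lock(&sh->lock);
                        sh->counter++;
                        pthread_mutex_unlock(&sh->lock);
                }
                _exit(0);
        }
        for (i = 0; i < 100000; i++) {          /* parent */
                pthread_mutex_lock(&sh->lock);
                sh->counter++;
                pthread_mutex_unlock(&sh->lock);
        }
        wait(NULL);
        printf("counter = %ld (expected 200000)\n", sh->counter);

        pthread_mutex_destroy(&sh->lock);
        munmap(sh, sizeof(*sh));
        return 0;
}

Compile with -pthread. The same attribute setup applies to a mutex placed in
any memory both processes have mapped, e.g. a shm_open() segment shared
between unrelated processes.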