Date: Mon, 9 Jun 2008 20:18:25 -0700 (PDT)
From: Christoph Lameter
To: Rusty Russell
Cc: Mike Travis, Andrew Morton, linux-arch@vger.kernel.org,
    linux-kernel@vger.kernel.org, David Miller, Eric Dumazet, Peter Zijlstra
Subject: Re: [patch 04/41] cpu ops: Core piece for generic atomic per cpu operations
In-Reply-To: <200806101256.31615.rusty@rustcorp.com.au>
References: <20080530035620.587204923@sgi.com> <200806100927.48597.rusty@rustcorp.com.au> <200806101256.31615.rusty@rustcorp.com.au>

On Tue, 10 Jun 2008, Rusty Russell wrote:

> > Right, that is what the cpu alloc patches do. So you could implement
> > cpu_local_inc on top of some of the cpu alloc patches.
>
> Or you could just implement it today as a standalone patch.

You need at least the zero basing to enable the use of the segment register
on x86_64.

> > But then the whole point of local_t is gone. Why not use atomic_t in the
> > first place?
>
> Because some archs can do better.

The argument does not make any sense. First you want to use atomic_t, then
not?

> > > By definition if the caller cared, they would have had preemption
> > > disabled.
> >
> > There are numerous instances where the caller does not care about
> > preemption.
> > It's just important that one per cpu counter is incremented in
> > the least intrusive way. See e.g. the VM event counters.
>
> Yes, and that's exactly the point. The VM event counters are exactly a case
> where you should have used cpu_local_inc.

I tried it and it did not give any benefit, except that it first failed due
to bugs because local_t did not disable preempt... This led to Andi fixing
local_t. But with the preempt disabling I could not discern what the benefit
would be. CPU_INC does not require disabling of preempt, and the cpu alloc
patches significantly shorten the code sequence to increment a VM counter.

Here is the header from the patch. How would cpu_local_inc be able to do
better unless you adopt this patchset and add a shim layer?

Subject: VM statistics: Use CPU ops

The use of CPU ops here avoids the offset calculations that we used to have
to do with per cpu operations. The result of this patch is that event
counters are coded with a single instruction the following way:

	incq   %gs:offset(%rip)

Without these patches this was:

	mov    %gs:0x8,%rdx
	mov    %eax,0x38(%rsp)
	mov    xxx(%rip),%eax
	mov    %eax,0x48(%rsp)
	mov    varoffset,%rax
	incq   0x110(%rax,%rdx,1)
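
The difference between the two instruction sequences can be illustrated
outside the kernel. What follows is a minimal user-space C sketch, not kernel
code and not the CPU ops implementation from the patchset. The names
sim_area, sim_current_cpu, sim_count_event_lookup, tls_events and
tls_count_event, and the sizes NR_CPUS and NR_EVENTS, are all hypothetical.
Thread-local storage is used only as an analogy: on x86_64 Linux a compiler
will typically address a thread-local variable relative to the %fs segment
register, much as the patched kernel addresses per cpu data relative to %gs.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4   /* hypothetical: size of the simulated machine */
#define NR_EVENTS 8   /* hypothetical: number of event counters */

/* Simulated per cpu areas, one row of counters per CPU. */
static unsigned long sim_area[NR_CPUS][NR_EVENTS];

/* Hypothetical stand-in for "the CPU this task is currently running on". */
static int sim_current_cpu;

/*
 * Multi-step variant: look up the CPU, compute the counter address, then
 * increment. If the task migrated between step 2 and step 3, ctr would
 * point at another CPU's counter, which is why a sequence like this
 * needs preemption disabled around it in the kernel.
 */
static void sim_count_event_lookup(int item)
{
    assert(item >= 0 && item < NR_EVENTS);

    int cpu = sim_current_cpu;                  /* 1. read the CPU number  */
    unsigned long *ctr = &sim_area[cpu][item];  /* 2. compute the address  */
    (*ctr)++;                                   /* 3. read-modify-write    */
}

/*
 * Segment-relative variant (analogy only): the counter lives at a fixed
 * offset from a per-thread base, so there is no separate lookup step and
 * the increment can typically be emitted as one instruction.
 */
static _Thread_local unsigned long tls_events[NR_EVENTS];

static void tls_count_event(int item)
{
    assert(item >= 0 && item < NR_EVENTS);
    tls_events[item]++;
}
```

Calling sim_count_event_lookup(3) with sim_current_cpu set to 2 increments
sim_area[2][3] and leaves the other rows untouched, while tls_count_event(1)
touches only the calling thread's copy of tls_events. The sketch shows only
the addressing difference. It does not reproduce the kernel's preemption
behaviour.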