Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751473AbWALXzh (ORCPT ); Thu, 12 Jan 2006 18:55:37 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751487AbWALXzh (ORCPT ); Thu, 12 Jan 2006 18:55:37 -0500 Received: from e3.ny.us.ibm.com ([32.97.182.143]:59534 "EHLO e3.ny.us.ibm.com") by vger.kernel.org with ESMTP id S1751473AbWALXzg (ORCPT ); Thu, 12 Jan 2006 18:55:36 -0500 Subject: Re: [Lse-tech] Re: [ckrm-tech] Re: [PATCH 00/01] Move Exit Connectors From: Matt Helsley To: Jes Sorensen Cc: Shailabh Nagar , Andrew Morton , Jay Lan , LKML , elsa-devel@lists.sourceforge.net, lse-tech@lists.sourceforge.net, CKRM-Tech , Paul Jackson , Erik Jacobson , Jack Steiner , John Hesterberg In-Reply-To: References: <43BB05D8.6070101@watson.ibm.com> <43BB09D4.2060209@watson.ibm.com> <43BC1C43.9020101@engr.sgi.com> <1136414431.22868.115.camel@stark> <20060104151730.77df5bf6.akpm@osdl.org> <1136486566.22868.127.camel@stark> <1136488842.22868.142.camel@stark> <20060105151016.732612fd.akpm@osdl.org> <1136505973.22868.192.camel@stark> <43BE9E91.9060302@watson.ibm.com> Content-Type: text/plain Date: Thu, 12 Jan 2006 15:49:13 -0800 Message-Id: <1137109753.6673.276.camel@stark> Mime-Version: 1.0 X-Mailer: Evolution 2.0.4 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4879 Lines: 103 On Wed, 2006-01-11 at 05:36 -0500, Jes Sorensen wrote: > >>>>> "Shailabh" == Shailabh Nagar writes: > > Shailabh> Jes Sorensen wrote: > >> I am quite concerned about that lock your patches put into struct > >> task_struct through struct task_delay_info. Have you done any > >> measurements on how this impacts performance on highly threaded > >> apps on larger system? > > Shailabh> I don't expect the lock contention to be high. The lock is > Shailabh> held for a very short time (across two > Shailabh> additions/increments). Moreover, it gets contended only when > Shailabh> the stats are being read (either through /proc or > Shailabh> connectors). Since the reading of stats won't be that > Shailabh> frequent (the utility of these numbers is to influence the > Shailabh> I/O priority/rss limit etc. which won't be done at a very > Shailabh> small granularity anyway), I wouldn't expect a problem. Not just when the stats are being read, but when the stats of that particular task (A) are being read by one task (B) AND updated by the task itself (A). > Hi Shailabh, > > When this is read through connectors, it's initiated by the connectors > code which is called from the task's context hence we don't need > locking for that. It's very similar to the task_notify code I am about > to post and I think the connector code could fit into that > framework. The main issue is /proc, but then one could even have a > mechanism with a hook when the task exits that pushes the data to a > storage point which is lock protected. Hmm, which task? The request for stats does not necessarily/usualy originate from the task we desire stats on. Hence the synchronization. In this way it's significantly different from lockless task_notify. > Even if a lock isn't contended, you are still going to see the cache > lines bounce around due to the writes. It may not show up on a 4-way > box but what happens on a 64-way? We have seen some pretty nasty > effects on the bigger SN2 boxes with locks that were taken far too > frequently, to the point where it would prevent the box from booting > (now I don't expect it to that severe here, but I'd still like to > explore an approach of doing it lock free). You're referring to the performance impact of a global lock on a large system. The lock Shailabh introduced is per-task. Those won't bounce around nearly as much -- I think they bounce under the following conditions: - The task in which the lock is embedded is in the cache of processor A and the task reading the stats of that task is on processor B. - The scheduler bounces a task between processor A and processor B. Am I missing any other circumstances under which it could bounce? > Shailabh> But its better to take some measurements anyway. Any > Shailabh> suggestions on a benchmark ? > > >> IMHO it seems to make more sense to use something like Jack's > >> proposed task_notifier code to lock-less collect the data into task > >> local data structures and then take the data from there and ship > >> off to userland through netlink or similar like you are doing? > >> > >> I am working on modifying Jack's patch to carry task local data and > >> use it for not just accounting but other areas that need optional > >> callbacks (optional in the sense that it's a feature that can be > >> enabled or disabled). Looking at Shailabh's delayacct_blkio() > >> changes it seems that it would be really easy to fit those into > >> that framework. > >> > >> Guess I should post some of this code ..... > > Shailabh> Please do. If this accounting can fit into some other > Shailabh> framework, thats fine too. > > Ok, finally, sorry for the delay. My current code snapshot is > available at http://www.trained-monkey.org/~jes/patches/task_notify/ - > it's a modified version of Jack's task_notify code, and three example > users of it (the SysV IPC semundo semaphore, the key infrastructure > and SGI's JOB module). The patch order should be task_notify.diff, > task-notify-keys.diff, task-notify-semundo.diff, and > task_notify-job.diff last. > > I think task_notify it should be usable for statistics gathering as > well, the only issue is how to attach it to the processes we wish to > gather accounting for. Personally I am not a big fan of the current > concept where statistics are gathered for all tasks at all time but > just not exported until accounting is enabled. The only potential problem I can see with using task_notify for gathering statistics is each section of code that increments stats would have to look for its notify block in the task's list. Cheers, -Matt Helsley - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/