Subject: Re: [RFC/RFT PATCH v3] sched: automated per tty task groups
From: Mike Galbraith
To: Oleg Nesterov
Cc: Linus Torvalds, Peter Zijlstra, Mathieu Desnoyers, Ingo Molnar, LKML,
    Markus Trippelsdorf
Date: Sat, 13 Nov 2010 04:42:04 -0700
Message-ID: <1289648524.22764.149.camel@maggy.simson.net>
In-Reply-To: <20101112181240.GB8659@redhat.com>
References: <1287648715.9021.20.camel@marge.simson.net>
    <20101021105114.GA10216@Krystal>
    <1287660312.3488.103.camel@twins>
    <20101021162924.GA3225@redhat.com>
    <1288076838.11930.1.camel@marge.simson.net>
    <1288078144.7478.9.camel@marge.simson.net>
    <1289489200.11397.21.camel@maggy.simson.net>
    <20101111202703.GA16282@redhat.com>
    <1289514000.21413.204.camel@maggy.simson.net>
    <20101112181240.GB8659@redhat.com>

On Fri, 2010-11-12 at 19:12 +0100, Oleg Nesterov wrote:
> On 11/11, Mike Galbraith wrote:
> >
> > On Thu, 2010-11-11 at 21:27 +0100, Oleg Nesterov wrote:
> >
> > > But the real problem is that copy_process() can fail after that,
> > > and in this case we have the unbalanced kref_get().
> >
> > Memory leak, will fix.
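
The leak on the copy_process() failure path mentioned above can be
illustrated with a small userspace sketch. This is not the patch's code:
struct ag, struct task, ag_get(), ag_put() and fork_sketch() are
hypothetical stand-ins, and a plain integer stands in for the kref. The
only point is that a reference taken early in a fork-like path has to be
dropped again on every later failure exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted autogroup (not the kernel type). */
struct ag {
    int refs;
};

/* Hypothetical stand-in for a task that points at its group. */
struct task {
    struct ag *group;
};

static void ag_get(struct ag *g)
{
    g->refs++;
}

static void ag_put(struct ag *g)
{
    assert(g->refs > 0);    /* catch an unbalanced put early */
    g->refs--;
}

/*
 * Fork-like path: the child takes a reference on the parent's group,
 * then a later step may fail. The failure exit must drop the reference
 * again, otherwise the count stays one too high (the leak).
 */
static int fork_sketch(struct task *parent, struct task *child,
                       bool later_step_fails)
{
    child->group = parent->group;
    ag_get(child->group);

    if (later_step_fails)
        goto undo;

    return 0;

undo:
    ag_put(child->group);    /* balance the get taken above */
    child->group = NULL;
    return -1;
}
```

With the undo label removed, a failed call would leave the count one
higher than before the call, which is the imbalance described above.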
> >
> > > > +++ linux-2.6.36.git/kernel/exit.c
> > > > @@ -174,6 +174,7 @@ repeat:
> > > >      write_lock_irq(&tasklist_lock);
> > > >      tracehook_finish_release_task(p);
> > > >      __exit_signal(p);
> > > > +    sched_autogroup_exit(p);
> > >
> > > This doesn't look right. Note that "p" can run/sleep after that
> > > (or in parallel), set_task_rq() can use the freed ->autogroup.
> >
> > So avoiding refcounting rcu released task_group backfired.  Crud.
>
> Just in case, the lock order may be wrong. sched_autogroup_exit()
> takes task_group_lock under write_lock(tasklist), while
> sched_autogroup_handler() takes them in reverse order.

Bug self destructs when global classifier goes away.

> I am not sure, but perhaps this can be simpler?
> wake_up_new_task() does autogroup_fork(), and do_exit() does
> sched_autogroup_exit() before the last schedule. Possible?

That's what I was going to do.  That said, I couldn't have had the
problem if I'd tied final put directly to life of container, and am
thinking I should do that instead when I go back to p->signal.

> Very basic question. Currently sched_autogroup_create_attach()
> has the only caller, __proc_set_tty(). It is a bit strange that
> signal->tty change is process-wide, but sched_autogroup_create_attach()
> move the single thread, the caller. What about other threads in
> this thread group? The same for proc_clear_tty().

Yeah, I really should (will) move all on the spot, though it doesn't
seem to matter in general practice, forks afterward land in the right
bucket.  With per tty or p->signal, migration will pick up stragglers
lazily.. unless they're pinned.

> > +void sched_autogroup_create_attach(struct task_struct *p)
> > +{
> > +    autogroup_move_task(p, autogroup_create());
> > +
> > +    /*
> > +     * Correct freshly allocated group's refcount.
> > +     * Move takes a reference on destination, but
> > +     * create already initialized refcount to 1.
> > +     */
> > +    if (p->autogroup != &autogroup_default)
> > +        autogroup_kref_put(p->autogroup);
> > +}
>
> OK, but I don't understand "p->autogroup != &autogroup_default"
> check. This is true if autogroup_create() succeeds. Otherwise
> autogroup_create() does autogroup_kref_get(autogroup_default),
> doesn't this mean we need unconditional _put ?

D'oh, target fixation :)  Thanks.

> And can't resist, minor cosmetic nit,
>
> >  static inline struct task_group *task_group(struct task_struct *p)
> >  {
> > +    struct task_group *tg;
> >      struct cgroup_subsys_state *css;
> >
> >      css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
> >              lockdep_is_held(&task_rq(p)->lock));
> > -    return container_of(css, struct task_group, css);
> > +    tg = container_of(css, struct task_group, css);
> > +
> > +    autogroup_task_group(p, &tg);
>
> Fell free to ignore, but imho
>
>     return autogroup_task_group(p, tg);
>
> looks a bit better. Why autogroup_task_group() returns its
> result via pointer?

No particularly good reason, I'll do the cosmetic change.

Thanks,

    -Mike
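
For reference, the refcount arithmetic behind the "unconditional _put"
point on sched_autogroup_create_attach() can be checked with a userspace
sketch. All names here (struct ag, ag_default, ag_create(),
ag_move_task(), ag_create_attach()) are hypothetical stand-ins, not the
patch's functions, and a plain integer replaces the kref. It follows
what the thread states: create initializes a fresh group's count to 1
or, on failure, takes an extra reference on the default group, and move
takes a reference on the destination. That move also drops the task's
reference on its old group is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted autogroup (not the kernel type). */
struct ag {
    int refs;
};

/* Default group; starts with one base reference. */
static struct ag ag_default = { .refs = 1 };

struct task {
    struct ag *group;
};

static void ag_get(struct ag *g)
{
    g->refs++;
}

static void ag_put(struct ag *g)
{
    assert(g->refs > 0);    /* catch an unbalanced put early */
    g->refs--;
}

/*
 * Create: on success the fresh group starts at refcount 1; on failure
 * fall back to the default group and take a reference on it.
 */
static struct ag *ag_create(struct ag *fresh, bool alloc_fails)
{
    if (alloc_fails) {
        ag_get(&ag_default);
        return &ag_default;
    }
    fresh->refs = 1;
    return fresh;
}

/*
 * Move: take a reference on the destination and (assumption of this
 * sketch) drop the task's reference on the group it leaves.
 */
static void ag_move_task(struct task *p, struct ag *dst)
{
    struct ag *old = p->group;

    ag_get(dst);
    p->group = dst;
    ag_put(old);
}

/*
 * Create and attach: both create paths hand back one reference that
 * move does not consume, so the put is needed in both cases.
 */
static void ag_create_attach(struct task *p, struct ag *fresh,
                             bool alloc_fails)
{
    struct ag *dst = ag_create(fresh, alloc_fails);

    ag_move_task(p, dst);
    ag_put(dst);    /* unconditional */
}
```

If the final put were skipped whenever the task ends up in ag_default,
the failure path would leave ag_default one reference too high.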