Date: Tue, 3 Feb 2015 21:09:16 +0100
From: Oleg Nesterov
To: Peter Zijlstra
Cc: Darren Hart, Thomas Gleixner, Jerome Marchand, Larry Woodman, Mateusz Guzik, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/1] futex: check PF_KTHREAD rather than !p->mm to filter out kthreads
Message-ID: <20150203200916.GA10545@redhat.com>
References: <20150202140515.GA26398@redhat.com> <20150202151159.GE26304@twins.programming.kicks-ass.net>
In-Reply-To: <20150202151159.GE26304@twins.programming.kicks-ass.net>

Peter, I am getting more confused when I re-read your email today ;)
see below.

Btw, do you agree with 1/1? Can you ack/nack it?

On 02/02, Peter Zijlstra wrote:
>
> On Mon, Feb 02, 2015 at 03:05:15PM +0100, Oleg Nesterov wrote:
>
> > And another question. Let's forget about this ->mm check. I simply cannot
> > understand this
> >
> > 	ret = (p->flags & PF_EXITPIDONE) ? -ESRCH : -EAGAIN
> >
> > I must have missed something but this looks buggy, I do not see any
> > preemption point in this "retry" loop. Suppose that max_cpus=1 and rt_task()
> > preempts the non-rt PF_EXITING owner. Looks like futex_lock_pi() can spin
> > forever in this case? (OK, ignoring RT throttling).
>
> So yes, I do like your proposal of putting PF_EXITPIDONE under the
> ->pi_lock section that handles exit_pi_state_list().

Probably I was not clear... Let me try again just in case.
I believe that the whole "spin waiting for PF_EXITING -> PF_EXITPIDONE
transition" idea is simply wrong. See the test-case I sent.

I think that attach_to_pi_owner() should never check PF_EXITING and never
return -EAGAIN. It should either proceed and add pi_state to the list or
return -ESRCH if exit_pi_state_list() was called.

Do you agree?

Perhaps we can set PF_EXITPIDONE lockless and avoid the unconditional
lock(pi_lock) but this is minor. The main problem is that I fail to
understand why this logic was added in the first place... To avoid the
race with exit_robust_list()? I do not see why this is needed...

> As for the recursive fault; I think the safer option is to set
> EXITPIDONE and not register more PI states, as opposed to allowing more
> and more states to be added. Yes we'll leak whatever currently is there,
> but no point in allowing it to get worse.

Not sure I understand... If you mean recursive do_exit() then yes, I think
that we should simply set EXITPIDONE lockless in a best-effort manner, this
is what the current code does. Just the comment should be updated in any
case imo.

But mostly I was confused by the pseudo-code below. Heh, because I thought
that it describes the changes in kernel/futex.c you think we should do. Now
that I finally realized that it outlines the current code I am unconfused
a bit ;)

Oleg.

> do_exit()
> {
> 	exit_signals(tsk); /* sets PF_EXITING */
>
> 	smp_mb();
> 	raw_spin_unlock_wait(&tsk->pi_lock);
>
> 	exit_mm() {
> 		mm_release() {
> 			exit_pi_state_list();
> 		}
> 	}
>
> 	tsk->flags |= PF_EXITPIDONE;
> }
>
> vs
>
> futex_lock_pi()
> {
> retry:
> 	...
>
> 	ret = futex_lock_pi_atomic() {
> 		attach_to_pi_owner() {
> 			raw_spin_lock(&tsk->pi_lock);
> 			if (PF_EXITING) {
> 				ret = PF_EXITPIDONE ? -ESRCH : -EAGAIN;
> 				raw_spin_unlock(&tsk->pi_lock);
> 				return ret;
> 			}
> 		}
> 	}
> 	if (ret) {
> 		switch(ret) {
> 		...
>
> 		case -EAGAIN:
> 			...
> 			cond_resched();
> 			goto retry;
> 		}
> 	}
> }
>
> vs
>
> futex_requeue()
> {
> retry:
> 	...
>
> 	ret = futex_proxy_trylock_atomic() {
> 		ret = futex_lock_pi_atomic() {
> 			attach_to_pi_owner() {
> 				raw_spin_lock(&tsk->pi_lock);
> 				if (PF_EXITING) {
> 					ret = PF_EXITPIDONE ? -ESRCH : -EAGAIN;
> 					raw_spin_unlock(&tsk->pi_lock);
> 					return ret;
> 				}
> 			}
> 		}
> 	}
>
> 	if (ret > 0) {
> 		ret = lookup_pi_state() {
> 			attach_to_pi_owner() {
> 				raw_spin_lock(&tsk->pi_lock);
> 				if (PF_EXITING) {
> 					ret = PF_EXITPIDONE ? -ESRCH : -EAGAIN;
> 					raw_spin_unlock(&tsk->pi_lock);
> 					return ret;
> 				}
> 			}
> 		}
> 	}
>
> 	...
> 	switch(ret) {
> 	...
> 	case -EAGAIN:
> 		...
> 		cond_resched();
> 		goto retry;
> 	}
> }
>
> vs
>
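For illustration, the difference between the check in the quoted pseudo-code and the behaviour proposed in this mail (never look at PF_EXITING, never return -EAGAIN) can be modelled in user space. This is only a minimal single-threaded sketch under stated assumptions, not kernel code: struct model_task, the MODEL_PF_* values and all three function names are hypothetical, and the pi_lock handling is left out entirely.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel task flags; the values are arbitrary. */
#define MODEL_PF_EXITING	0x1u
#define MODEL_PF_EXITPIDONE	0x2u

/* Toy model of the futex owner; pi_state_count stands in for its pi_state list. */
struct model_task {
	unsigned int flags;
	int pi_state_count;
};

/* The check as outlined in the quoted pseudo-code (locking omitted). */
int attach_current(struct model_task *t)
{
	assert(t);
	if (t->flags & MODEL_PF_EXITING)
		return (t->flags & MODEL_PF_EXITPIDONE) ? -ESRCH : -EAGAIN;
	t->pi_state_count++;
	return 0;
}

/*
 * The proposed check: PF_EXITING is ignored. Either the pi_state is added
 * to the list, or -ESRCH is returned because the list was already drained.
 */
int attach_proposed(struct model_task *t)
{
	assert(t);
	if (t->flags & MODEL_PF_EXITPIDONE)
		return -ESRCH;
	t->pi_state_count++;
	return 0;
}

/*
 * Model of the "retry" loop when the owner makes no progress, as in the
 * max_cpus=1 scenario above: count how often the caller goes around,
 * capped at max_spins so the model terminates.
 */
int retries_until_done(int (*attach)(struct model_task *),
		       struct model_task *t, int max_spins)
{
	int spins = 0;

	while (attach(t) == -EAGAIN && spins < max_spins)
		spins++;
	return spins;
}
```

With an owner that is stuck in PF_EXITING and never reaches PF_EXITPIDONE, retries_until_done() runs into the cap with attach_current() but returns 0 with attach_proposed(), since the latter has no -EAGAIN path to spin on.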