Date: Tue, 15 Dec 2015 20:01:32 +0100
From: Oleg Nesterov
To: Paul Turner
Cc: Peter Zijlstra, NeilBrown, Linus Torvalds, Thomas Gleixner, LKML,
    Mike Galbraith, Ingo Molnar, Peter Anvin, vladimir.murzin@arm.com,
    linux-tip-commits@vger.kernel.org, jstancek@redhat.com
Subject: Re: [tip:locking/core] sched/wait: Fix signal handling in bit wait helpers
Message-ID: <20151215190132.GB19007@redhat.com>
References: <20151201130404.GL3816@twins.programming.kicks-ass.net>
    <20151208104712.GJ6356@twins.programming.kicks-ass.net>
    <87zixkph0m.fsf@notabene.neil.brown.name>
    <20151209074033.GF6357@twins.programming.kicks-ass.net>
    <87si3bpaxy.fsf@notabene.neil.brown.name>
    <20151210130948.GW6356@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/11, Paul Turner wrote:
>
> Peter's proposed follow-up above looks strictly more correct.  We need
> to evaluate the potential existence of a signal, *after* we return
> from schedule,

I still don't understand this...

signal_pending_check(current->state) before schedule() should be fine
even if it actually reads current->state twice and it races with
wakeup/signal_wake_up() which can change the caller's state.

> but in the context of the state which we previously
> _entered_ schedule() on.

Yes, but only if we do this after return from schedule().

But somehow this change helps.
It adds subtle difference(s), for example __wait_on_bit_lock() won't do
another test_and_set_bit() if the sleeping caller is killed, but this
shouldn't matter. And if this does matter because it has a buggy user,
then it is not clear why the change from Vladimir helps too.

The common part is that both changes make "return 1" impossible, but
according to another email from Peter this just makes the failure less
likely.

I am really puzzled.

Oleg.
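
Editor's sketch of the two orderings discussed above, as a userspace toy model and not kernel code. Everything here is an assumption made for illustration: struct toy_task, toy_schedule(), toy_signal_pending_state(), wait_check_before() and wait_check_after() are hypothetical names, the state values are arbitrary, and the signal check is reduced to "interruptible sleep with a signal pending". It only shows why a check made after schedule() has to use the state the caller entered schedule() with: by then the live state has been reset to running.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for task states; the values are illustrative only. */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1, TASK_UNINTERRUPTIBLE = 2 };

struct toy_task {
    int state;        /* live state, reset to TASK_RUNNING on wakeup */
    bool sigpending;  /* a signal is waiting for this task */
};

/* Simplified check: a signal only matters for an interruptible sleep. */
static bool toy_signal_pending_state(int state, const struct toy_task *t)
{
    return state == TASK_INTERRUPTIBLE && t->sigpending;
}

/* Model of schedule(): a signal may arrive while asleep; waking up
 * always leaves the task in TASK_RUNNING. */
static void toy_schedule(struct toy_task *t, bool signal_arrives)
{
    if (signal_arrives)
        t->sigpending = true;
    t->state = TASK_RUNNING;
}

/* Ordering 1: check the live state before sleeping ("return 1" path). */
static int wait_check_before(struct toy_task *t, bool signal_arrives)
{
    if (toy_signal_pending_state(t->state, t))
        return 1;
    toy_schedule(t, signal_arrives);
    return 0;
}

/* Ordering 2: sleep first, then check against the mode the caller
 * entered the sleep with, not the (already reset) live state. */
static int wait_check_after(struct toy_task *t, int mode, bool signal_arrives)
{
    toy_schedule(t, signal_arrives);
    if (toy_signal_pending_state(mode, t))
        return -EINTR;
    return 0;
}

/* Small self-check walking through both orderings. */
static void toy_demo(void)
{
    struct toy_task a = { TASK_INTERRUPTIBLE, false };

    /* Signal arrives during the sleep: this call cannot report it. */
    assert(wait_check_before(&a, true) == 0);
    /* The live state is now TASK_RUNNING, so it cannot be used here. */
    assert(!toy_signal_pending_state(a.state, &a));
    /* A retrying caller sets the state again and then sees the signal. */
    a.state = TASK_INTERRUPTIBLE;
    assert(wait_check_before(&a, false) == 1);

    /* With the saved mode, the same wakeup is reported immediately. */
    struct toy_task b = { TASK_INTERRUPTIBLE, false };
    assert(wait_check_after(&b, TASK_INTERRUPTIBLE, true) == -EINTR);
}
```

Calling toy_demo() from a main() runs the walk-through. The model says nothing about which ordering is correct in the kernel, or about why either patch helps; that question is exactly what the message above leaves open.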