Date: Wed, 14 Jun 2017 11:02:40 -0400
From: Steven Rostedt <rostedt@goodmis.org>
To: "Paul E. McKenney"
Cc: Peter Zijlstra, Krister Johansen, Ingo Molnar,
    linux-kernel@vger.kernel.org, Paul Gortmaker, Thomas Gleixner
Subject: Re: [PATCH tip/sched/core] Add comments to aid in safer usage of swake_up.
Message-ID: <20170614110240.10abe2ed@gandalf.local.home>
In-Reply-To: <20170614091015.01d7dc89@gandalf.local.home>
References: <20170609032546.GF2553@templeofstupid.com>
	<20170609071957.GJ8337@worktop.programming.kicks-ass.net>
	<20170609124554.GF3721@linux.vnet.ibm.com>
	<20170613192308.173dd86a@gandalf.local.home>
	<20170613234205.GD3721@linux.vnet.ibm.com>
	<20170613211547.49814d25@gandalf.local.home>
	<20170614035843.GI3721@linux.vnet.ibm.com>
	<20170614091015.01d7dc89@gandalf.local.home>

On Wed, 14 Jun 2017 09:10:15 -0400
Steven Rostedt wrote:

> Now let's make it simpler. I'll even add the READ_ONCE and WRITE_ONCE
> where applicable.
> 
> 
> 	CPU0				CPU1
> 	----				----
>  LOCK(A)
> 				LOCK(B)
>  WRITE_ONCE(X, INIT)
> 
>   (the cpu may postpone writing X)
> 
>   (the cpu can fetch wq list here)
> 
> 				list_add(wq, q)
> 
> 				UNLOCK(B)
> 
> 				 (the cpu may fetch old value of X)
> 
>   (write of X happens here)
> 
> 				if (READ_ONCE(X) != INIT)
> 					schedule();
> 
>  UNLOCK(A)
> 
>  if (list_empty(wq))
> 	return;
> 
> Tell me again how the READ_ONCE() and WRITE_ONCE() help in this
> scenario?
> 
> Because we are using spinlocks, this won't be an issue for most
> architectures. The bug happens if the fetching of the list_empty()
> leaks into before the UNLOCK(A).
> 
> If the reading/writing of the list and the reading/writing of gp_flags
> get reversed in either direction by the CPU, then we have a problem.

FYI..

Both sides need a memory barrier. Otherwise, even with a memory barrier
on CPU1 we can still have:


	CPU0				CPU1
	----				----
				LOCK(A)

 LOCK(B)
 list_add(wq, q)

  (cpu waits to write wq list)

  (cpu fetches X)

				WRITE_ONCE(X, INIT)

				UNLOCK(A)

				smp_mb();

				if (list_empty(wq))
					return;

  (cpu writes wq list)

 UNLOCK(B)

 if (READ_ONCE(X) != INIT)
	schedule()

Luckily for us, there is a memory barrier on CPU0. In
prepare_to_swait() we have:

	raw_spin_lock_irqsave(&q->lock, flags);
	__prepare_to_swait(q, wait);
	set_current_state(state);
	raw_spin_unlock_irqrestore(&q->lock, flags);

And that set_current_state() call includes a memory barrier, which will
prevent the above from happening, as the addition to the wq list must
be flushed before fetching X.

I still strongly believe that swait_active() requires a memory
barrier.

-- Steve
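
[Editorial illustration, not part of the original mail]

The lost wakeup above is the classic store-buffering pattern, and it can
be modeled outside the kernel. What follows is a minimal user-space
sketch using C11 atomics in place of the kernel primitives. The names
(cond_flag, on_wait_list, woken) are invented for the illustration; the
spinlocks are omitted because they do not order the cross-CPU accesses
in question; and seq_cst fences stand in for the barrier implied by
set_current_state() on the waiter side and for the smp_mb() the mail
argues swait_active() callers need on the waker side.

	/*
	 * Sketch of the waiter/waker race discussed above, modeled with
	 * C11 atomics.  All names here are made up for the illustration.
	 */
	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int cond_flag;    /* stands in for X (gp_flags)        */
	static atomic_int on_wait_list; /* models list_add()/list_empty()    */
	static atomic_int woken;        /* models the wakeup from swake_up() */

	static void *waiter(void *arg)  /* CPU0 side */
	{
		(void)arg;

		/* list_add(wq, q) */
		atomic_store_explicit(&on_wait_list, 1, memory_order_relaxed);

		/*
		 * set_current_state() implies smp_mb() in the kernel; this
		 * fence plays that role.  It keeps the list store from being
		 * reordered after the cond_flag load below.
		 */
		atomic_thread_fence(memory_order_seq_cst);

		/* if (READ_ONCE(X) != INIT) schedule(); */
		if (atomic_load_explicit(&cond_flag, memory_order_relaxed) == 0)
			while (!atomic_load_explicit(&woken, memory_order_relaxed))
				; /* "sleeping": spins until the wakeup arrives */
		return NULL;
	}

	static void *waker(void *arg)   /* CPU1 side */
	{
		(void)arg;

		/* WRITE_ONCE(X, INIT) */
		atomic_store_explicit(&cond_flag, 1, memory_order_relaxed);

		/*
		 * The smp_mb() argued for above: order the cond_flag store
		 * before the wait-list check.
		 */
		atomic_thread_fence(memory_order_seq_cst);

		/* if (list_empty(wq)) return;  else swake_up() */
		if (atomic_load_explicit(&on_wait_list, memory_order_relaxed))
			atomic_store_explicit(&woken, 1, memory_order_relaxed);
		return NULL;
	}

	int main(void)
	{
		pthread_t w0, w1;

		pthread_create(&w0, NULL, waiter, NULL);
		pthread_create(&w1, NULL, waker, NULL);
		pthread_join(w0, NULL);
		pthread_join(w1, NULL);
		printf("no lost wakeup\n");
		return 0;
	}

Built with "cc -pthread", this always terminates: with both fences in
place, at least one thread must observe the other's store, so the
waiter either sees cond_flag set and never sleeps, or the waker sees
the wait list non-empty and delivers the wakeup. Removing either fence
permits both threads to read the other's old value, which is exactly
the lost wakeup in the diagrams; that is the "both sides need a memory
barrier" point of the mail.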