Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966611AbbFJQFv (ORCPT ); Wed, 10 Jun 2015 12:05:51 -0400 Received: from mx1.redhat.com ([209.132.183.28]:38010 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965264AbbFJQFr (ORCPT ); Wed, 10 Jun 2015 12:05:47 -0400 Date: Wed, 10 Jun 2015 18:04:44 +0200 From: Oleg Nesterov To: Kirill Tkhai Cc: Peter Zijlstra , umgwanakikbuti@gmail.com, mingo@elte.hu, ktkhai@parallels.com, rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com, pang.xunlei@linaro.org, wanpeng.li@linux.intel.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH 08/14] hrtimer: Allow hrtimer::function() to free the timer Message-ID: <20150610160444.GB3341@redhat.com> References: <20150605084836.364306429@infradead.org> <20150605085205.723058588@infradead.org> <20150607223317.GA5193@redhat.com> <20150608091417.GM19282@twins.programming.kicks-ass.net> <20150608124234.GW18673@twins.programming.kicks-ass.net> <20150609213318.GA12436@redhat.com> <1433922411.23588.132.camel@odin.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1433922411.23588.132.camel@odin.com> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2558 Lines: 122 Hi Kirill, On 06/10, Kirill Tkhai wrote: > > В Вт, 09/06/2015 в 23:33 +0200, Oleg Nesterov пишет: > > > > hrtimer_active(timer) > > { > > > > do { > > base = READ_ONCE(timer->base->cpu_base); > > seq = read_seqcount_begin(&cpu_base->seq); > > > > if (timer->state & ENQUEUED || > > base->running == timer) > > return true; > > > > } while (read_seqcount_retry(&cpu_base->seq, seq) || > > base != READ_ONCE(timer->base->cpu_base)); > > > > return false; > > } > > > > And we need to avoid the races with 2 transitions in __run_hrtimer(). > > > > The first race is trivial, we change __run_hrtimer() to do > > > > write_seqcount_begin(cpu_base->seq); > > cpu_base->running = timer; > > __remove_hrtimer(timer); // clears ENQUEUED > > write_seqcount_end(cpu_base->seq); > > We use seqcount, because we are afraid that hrtimer_active() may miss > timer->state or cpu_base->running, when we are clearing it. Yes, > If we use two pairs of write_seqcount_{begin,end} in __run_hrtimer(), > we may protect only the places where we do that: > > cpu_base->running = timer; > write_seqcount_begin(cpu_base->seq); > __remove_hrtimer(timer); // clears ENQUEUED > write_seqcount_end(cpu_base->seq); > > .... > > timer->state |= HRTIMER_STATE_ENQUEUED; > write_seqcount_begin(cpu_base->seq); > base->running = NULL; > write_seqcount_end(cpu_base->seq); Afaics, no. Afaics, the following code is correct: seqcount_t LOCK; bool X = true, Y = false; void read(void) { bool x, y; do { seq = read_seqcount_begin(&LOCK); x = X; y = Y; } while (read_seqcount_retry(&LOCK, seq)); BUG_ON(!x && !y); } void write(void) { Y = true; write_seqcount_begin(LOCK); write_seqcount_end(LOCK); X = false; } If we rely on the "locking" semantics of seqcount_t, this doesn't really differ from spinlock_t LOCK; bool X = true, Y = false; void read(void) { bool x, y; spin_lock(LOCK); x = X; y = Y; spin_unlock(LOCK); BUG_ON(!x && !y); } void write(void) { Y = true; spin_lock(LOCK); spin_unlock(LOCK); X = false; } If "read" takes the lock before "write", it must see X == true. Otherwise "read" should see all memory changes done before or inside the "write" critical section, so it must see Y == true. No? Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/