Date: Sat, 25 Nov 2006 20:14:38 +0300
From: Oleg Nesterov
To: Alan Stern
Cc: "Paul E. McKenney", Jens Axboe, Kernel development list
Subject: Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync
Message-ID: <20061125171438.GA159@oleg>
References: <20061124211300.GA102@oleg>

On 11/24, Alan Stern wrote:
>
> On Sat, 25 Nov 2006, Oleg Nesterov wrote:
>
> > spin_lock() + spin_unlock() doesn't imply mb(), it allows subsequent loads
> > to move into the critical region.
>
> No, that's wrong.  Subsequent loads are allowed to move into the region
> protected by the spinlock, but not past it (into the xxx critical
> section).

Yes, you are right, but see below what I meant.

> > I personally prefer this way, but maybe you are right.
>
> See what you think...
>
> Alan
>
> //-----------------------------------------------------------------------------
> struct xxx_struct {
> 	int completed;
> 	int ctr[2];
> 	struct mutex mutex;
> 	spinlock_t lock;
> 	wait_queue_head_t wq;
> };
>
> void init_xxx_struct(struct xxx_struct *sp)
> {
> 	sp->completed = 0;
> 	sp->ctr[0] = 1;
> 	sp->ctr[1] = 0;
> 	spin_lock_init(&sp->lock);
> 	mutex_init(&sp->mutex);
> 	init_waitqueue_head(&sp->wq);
> }
>
> int xxx_read_lock(struct xxx_struct *sp)
> {
> 	int idx;
>
> 	spin_lock(&sp->lock);
> 	idx = sp->completed & 0x1;
> 	++sp->ctr[idx];
> 	spin_unlock(&sp->lock);
> 	return idx;
> }
>
> void xxx_read_unlock(struct xxx_struct *sp, int idx)
> {
> 	spin_lock(&sp->lock);

It is possible that the memory ops issued before spin_lock() have not
yet completed.

> 	if (--sp->ctr[idx] == 0)

Suppose that synchronize_xxx() has just unlocked sp->lock. It sees
sp->ctr[idx] == 0 and returns.

> 		wake_up(&sp->wq);
> 	spin_unlock(&sp->lock);

This is a one-way barrier, yes. But it is too late.

Actually, synchronize_xxx() may sleep on sp->wq and we still have a race.
synchronize_xxx() can return before wake_up() unlocks sp->wq.lock
(finish_wait() doesn't take sp->wq.lock due to autoremove_wake_function()).

> }
>
> void synchronize_xxx(struct xxx_struct *sp)
> {
> 	int idx;
>
> 	mutex_lock(&sp->mutex);
>
> 	spin_lock(&sp->lock);
> 	idx = sp->completed & 0x1;
> 	++sp->completed;
> 	--sp->ctr[idx];
> 	sp->ctr[idx ^ 1] = 1;
> 	spin_unlock(&sp->lock);
>
> 	wait_event(sp->wq, sp->ctr[idx] == 0);
> 	mutex_unlock(&sp->mutex);
> }

This is more or less equivalent to

	void synchronize_xxx(struct xxx_struct *sp)
	{
		int idx;

		mutex_lock(&sp->mutex);

		idx = sp->completed & 0x1;
		atomic_dec(sp->ctr + idx);
		smp_mb__before_atomic_inc();
		atomic_inc(sp->ctr + (idx ^ 0x1));
		sp->completed++;

		wait_event(sp->wq, !atomic_read(sp->ctr + idx));
		mutex_unlock(&sp->mutex);
	}

and lacks an optimization.

	void synchronize_xxx(struct xxx_struct *sp)
	{
		int idx;

		mutex_lock(&sp->mutex);

		spin_lock(&sp->lock);
		idx = sp->completed & 0x1;
		if (sp->ctr[idx] == 1) {
			spin_unlock(&sp->lock);
			goto out;
		}

		++sp->completed;
		--sp->ctr[idx];
		sp->ctr[idx ^ 1] = 1;
		spin_unlock(&sp->lock);

		wait_event(sp->wq, sp->ctr[idx] == 0);
	out:
		mutex_unlock(&sp->mutex);
	}

Honestly, I don't see why it is better, but maybe this is just me. In any
case, a spinlock-based implementation shouldn't be faster, yes?

Jens, Paul, what do you think?

Note also that 'atomic_add_unless' in synchronize_xxx() is not strictly
necessary, it is just for "symmetry"; we can do

	void synchronize_xxx(struct xxx_struct *sp)
	{
		int idx;

		mutex_lock(&sp->mutex);

		idx = sp->completed & 0x1;
		if (!atomic_read(sp->ctr + idx))
			goto out;

		atomic_dec(sp->ctr + idx);
		atomic_inc(sp->ctr + (idx ^ 0x1));
		sp->completed++;

		wait_event(sp->wq, !atomic_read(sp->ctr + idx));
	out:
		mutex_unlock(&sp->mutex);
	}

instead.

So the only complication I can see is the 'for' loop in xxx_read_lock().
Is it worth adding sp->lock?

Anyway, s/xxx/WHAT ???/ ?

Oleg.
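
For anyone who wants to poke at the counter scheme outside the kernel,
here is a rough userspace model of the spinlock-based variant with the
"no readers" fast path. It is only a sketch under stated assumptions:
pthread mutexes and a condition variable stand in for spinlock_t,
struct mutex and the wait queue, and none of this is kernel API. One
difference from the kernel version is worth keeping in mind:
pthread_cond_wait() reacquires the lock before it returns, so the waiter
here cannot return while the waker still holds the lock. The
wake_up()/finish_wait() race described above is therefore not
reproduced by this model.

```c
/*
 * Userspace model only. The pthread primitives are stand-ins chosen for
 * illustration; the names mirror the quoted kernel code.
 */
#include <assert.h>
#include <pthread.h>

struct xxx_struct {
	int completed;
	int ctr[2];
	pthread_mutex_t mutex;	/* serializes synchronize_xxx() callers */
	pthread_mutex_t lock;	/* stands in for spinlock_t lock */
	pthread_cond_t wq;	/* stands in for wait_queue_head_t wq */
};

void init_xxx_struct(struct xxx_struct *sp)
{
	sp->completed = 0;
	sp->ctr[0] = 1;		/* bias: the active counter rests at 1 */
	sp->ctr[1] = 0;
	pthread_mutex_init(&sp->mutex, NULL);
	pthread_mutex_init(&sp->lock, NULL);
	pthread_cond_init(&sp->wq, NULL);
}

int xxx_read_lock(struct xxx_struct *sp)
{
	int idx;

	pthread_mutex_lock(&sp->lock);
	idx = sp->completed & 0x1;
	++sp->ctr[idx];
	pthread_mutex_unlock(&sp->lock);
	return idx;
}

void xxx_read_unlock(struct xxx_struct *sp, int idx)
{
	pthread_mutex_lock(&sp->lock);
	assert(sp->ctr[idx] > 0);
	/* Reaches 0 only after synchronize_xxx() has removed the bias. */
	if (--sp->ctr[idx] == 0)
		pthread_cond_broadcast(&sp->wq);
	pthread_mutex_unlock(&sp->lock);
}

void synchronize_xxx(struct xxx_struct *sp)
{
	int idx;

	pthread_mutex_lock(&sp->mutex);
	pthread_mutex_lock(&sp->lock);
	idx = sp->completed & 0x1;
	if (sp->ctr[idx] != 1) {
		/* Readers present: flip the index and drop the bias. */
		++sp->completed;
		--sp->ctr[idx];
		sp->ctr[idx ^ 1] = 1;
		/* Wait for the old counter to drain to zero. */
		while (sp->ctr[idx] != 0)
			pthread_cond_wait(&sp->wq, &sp->lock);
	}
	/* Otherwise only the bias is left: nothing to wait for. */
	pthread_mutex_unlock(&sp->lock);
	pthread_mutex_unlock(&sp->mutex);
}
```

With no reader in flight, synchronize_xxx() takes the fast path and
leaves ->completed untouched. With a reader holding index 0, it flips
to index 1 and sleeps until the matching xxx_read_unlock() brings
ctr[0] down to zero.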