Date: Fri, 26 Jun 2009 16:50:27 +0200
From: Oleg Nesterov
To: Davide Libenzi
Cc: Eric Dumazet, Jiri Olsa, netdev@vger.kernel.org, Linux Kernel Mailing List, fbl@redhat.com, nhorman@redhat.com, davem@redhat.com, Tejun Heo
Subject: Re: [PATCH] net: fix race in the receive/select
Message-ID: <20090626145027.GA6534@redhat.com>
References: <20090625122545.GA3625@jolsa.lab.eng.brq.redhat.com> <20090625122416.GA23613@redhat.com> <4A442B65.8040701@gmail.com> <4A443033.8060401@gmail.com> <20090626135742.GB3845@redhat.com>

On 06/26, Davide Libenzi wrote:
>
> On Fri, 26 Jun 2009, Oleg Nesterov wrote:
>
> > And if we remove waitqueue_active() in xxx_update(), then lock/unlock is
> > not needed too.
> >
> > If xxx_poll() takes q->lock first, it can safely miss the changes in
> > ->status and schedule(): xxx_update() will take q->lock, notice the
> > sleeper and wake it up (ok, it will set ->triggered but this doesn't
> > matter).
> >
> > If xxx_update() takes q->lock first, xxx_poll() must see the changes in
> > ->status after poll_wait()->unlock(&q->lock) (in fact, after lock, not
> > unlock).
>
> Sure. The snippet above was just to show what typically the code does, not
> a suggestion on how to solve the socket case.

Yes, yes.
I just meant that, imho, you are right: we shouldn't add an mb() into
add_wait_queue().

> But yeah, the problem in this case is the waitqueue_active() call. Without
> that, the wait queue lock/unlock in poll_wait() and the one in wake_up()
> guarantees the necessary barriers.
> Some might argue the costs of the lock/unlock of q->lock, and wonder if
> MBs are a more efficient solution. This is something I'm not going into.
> To me, it just looked not right having cross-matching MB in different
> subsystems.

This is subjective and thus up to the maintainers, but personally I think
you are very, very right.

Perhaps we can add

	void sock_poll_wait(struct file *file, struct sock *sk, poll_table *pt)
	{
		if (pt) {
			poll_wait(file, sk->sk_sleep, pt);
			/*
			 * fat comment
			 */
			smp_mb(); /* or smp_mb__after_unlock(); */
		}
	}

Oleg.