Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755133AbZF1LLY (ORCPT ); Sun, 28 Jun 2009 07:11:24 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752294AbZF1LLL (ORCPT ); Sun, 28 Jun 2009 07:11:11 -0400 Received: from mail-bw0-f213.google.com ([209.85.218.213]:61042 "EHLO mail-bw0-f213.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751406AbZF1LLJ (ORCPT ); Sun, 28 Jun 2009 07:11:09 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=OAXyYBQbdE0wJETWapoNy6p3kgHC7Sr9n8UnthVD8zcFgoXHb1i+1TmuuDsT91Ahc9 UCdMT6E5Ih8Mz+Bh75lb2TFf6V2FiXD542KeU0pkhMwo5xtthzEhA+kGv13W9Ke+H0wd e23hDmIqVSXxpARW/KmOswCmTh4XWHyjaM/VQ= Message-ID: <4A474FB5.4070901@gmail.com> Date: Sun, 28 Jun 2009 13:10:45 +0200 From: Jarek Poplawski User-Agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103) MIME-Version: 1.0 To: Oleg Nesterov CC: Davide Libenzi , Eric Dumazet , Jiri Olsa , netdev@vger.kernel.org, Linux Kernel Mailing List , fbl@redhat.com, nhorman@redhat.com, davem@redhat.com, Tejun Heo Subject: Re: [PATCH] net: fix race in the receive/select References: <20090625122545.GA3625@jolsa.lab.eng.brq.redhat.com> <20090625122416.GA23613@redhat.com> <4A442B65.8040701@gmail.com> <4A443033.8060401@gmail.com> <20090626135742.GB3845@redhat.com> <20090626145027.GA6534@redhat.com> In-Reply-To: <20090626145027.GA6534@redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2472 Lines: 77 Oleg Nesterov wrote, On 06/26/2009 04:50 PM: > On 06/26, Davide Libenzi wrote: >> On Fri, 26 Jun 2009, Oleg Nesterov wrote: >> >>> And if we remove waitqueue_active() in xxx_update(), then lock/unlock is >>> not needed too. >>> >>> If xxx_poll() takes q->lock first, it can safely miss the changes in ->status >>> and schedule(): xxx_update() will take q->lock, notice the sleeper and wake >>> it up (ok, it will set ->triggered but this doesn't matter). >>> >>> If xxx_update() takes q->lock first, xxx_poll() must see the changes in >>> status after poll_wait()->unlock(&q->lock) (in fact, after lock, not unlock). >> Sure. The snippet above was just to show what typically the code does, not >> a suggestion on how to solve the socket case. > > Yes, yes. I just meant you are right imho, we shouldn't add mb() into > add_wait_queue(). > >> But yeah, the problem in this case is the waitqueue_active() call. Without >> that, the wait queue lock/unlock in poll_wait() and the one in wake_up() >> guarantees the necessary barriers. >> Some might argue the costs of the lock/unlock of q->lock, and wonder if >> MBs are a more efficient solution. This is something I'm not going into. >> To me, it just looked not right having cross-matching MB in different >> subsystems. > > This is subjective and thus up to maintainers, but personally I think you > are very, very right. > > Perhaps we can add > > void sock_poll_wait(struct file *file, struct sock *sk, poll_table *pt) > { > if (pt) { > poll_wait(file, sk->sk_sleep, pt); > /* > * fat comment > */ > smp_mb(); // or smp_mb__after_unlock(); > } > } > > Oleg. Maybe 'a bit' further?: static inline void __poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) { p->qproc(filp, wait_address, p); } static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) { if (p && wait_address) __poll_wait(filp, wait_address, p); } static inline void sock_poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) { if (p && wait_address) { __poll_wait(filp, wait_address, p); /* * fat comment */ smp_mb(); // or smp_mb__after_unlock(); } } Jarek P. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/