From: Darren Hart
Date: Thu, 19 Mar 2009 15:01:58 -0700
To: lkml
CC: Thomas Gleixner, Peter Zijlstra, Ingo Molnar, John Stultz, Jakub Jelinek, Ulrich Drepper, Eric Dumazet, Oleg Nesterov
Subject: Re: check *uaddr==val after queueing - without faulting
Message-ID: <49C2C0D6.5080700@us.ibm.com>
In-Reply-To: <49C2BCF4.50908@us.ibm.com>

Adding a few key folks to the Cc, apologies for the short initial Cc list.

Darren Hart wrote:
> The current futex_wait() code (I'm looking at tip/core/futexes)
> conflicts with a warning in the comments about checking *uaddr==val
> before the futex_q is queued on the hb list. While userspace is able to
> alter *uaddr at will and should expect to hang in the kernel forever
> should it do so haphazardly, there are legitimate scenarios where the
> futex value might change between the call to futex_wait() and when the
> futex_q gets on the hb list.
>
> For example, glibc protects access to the value of cond.__data.__futex
> via the cond.__data.__lock.
> However, before it can issue the syscall it
> has to drop the cond.__data.__lock, leaving a small race window where
> userspace might issue a signal or broadcast, which will modify the value
> of cond.__data.__futex. As I understand it, this will result in the
> value of the futex having changed before the waiter enters the kernel,
> with the waiter not enqueuing itself on the hb list until after the
> waker issues the broadcast that was intended to wake it up.
>
> I was working up a patch to move the test to after the call to
> queue_me(), but in order to do the test we also have to perform the
> get_user() after the queue_me(), which might sleep if we still hold the
> hb->lock. If we let queue_me() drop the hb->lock before we call
> get_user() then we may see a legitimate change in *uaddr that occurred
> after the queue_me() and before the get_user().
>
> I'm at a loss for how to resolve the race without causing the false
> positive inside the kernel. It might be resolvable in glibc by looking
> at the return code from futex_requeue and checking if the number
> woken_or_requeued agrees with the number it expected to be sleeping;
> this likely leaves other gaps for other waking calls, like FUTEX_WAKE.
>
> Any thoughts? Am I missing something that guards against this race?
>

-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team