Date: Mon, 27 Apr 2020 12:24:20 -0700
From: "Paul E. McKenney"
To: Will Deacon
Cc: Jann Horn, Peter Zijlstra, kernel list, Eric Dumazet, Kees Cook,
	Maddie Stone, Marco Elver, Thomas Gleixner, kernel-team,
	Kernel Hardening, Oleg Nesterov
Subject: Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
Message-ID: <20200427192420.GL7560@paulmck-ThinkPad-P72>
Reply-To: paulmck@kernel.org
References: <20200324153643.15527-1-will@kernel.org>
 <20200324153643.15527-4-will@kernel.org>
 <20200324165128.GS20696@hirez.programming.kicks-ass.net>
 <20200324213200.GA21176@willie-the-truck>
 <20200330231315.GZ19865@paulmck-ThinkPad-P72>
 <20200424173932.GK21141@willie-the-truck>
In-Reply-To: <20200424173932.GK21141@willie-the-truck>
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 24, 2020 at 06:39:33PM +0100, Will Deacon wrote:
> On Mon, Mar 30, 2020 at 04:13:15PM -0700, Paul E. McKenney wrote:
> > On Tue, Mar 24, 2020 at 09:32:01PM +0000, Will Deacon wrote:
> > > [mutt crashed while I was sending this; apologies if you receive it twice]
> > > 
> > > On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> > > > On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra wrote:
> > > > > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > > > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > > > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > > > > --- a/include/linux/list.h
> > > > > > +++ b/include/linux/list.h
> > > > > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > > > > >   */
> > > > > >  static inline int list_empty(const struct list_head *head)
> > > > > >  {
> > > > > > -	return READ_ONCE(head->next) == head;
> > > > > > +	return data_race(READ_ONCE(head->next) == head);
> > > > > >  }
> > > > > 
> > > > > list_empty() isn't lockless safe, that's what we have
> > > > > list_empty_careful() for.
> > > > 
> > > > That thing looks like it could also use some READ_ONCE() sprinkled in...
> > > 
> > > Crikey, how did I miss that?  I need to spend some time understanding the
> > > ordering there.
> > > 
> > > So it sounds like the KCSAN splats relating to list_empty() and loosely
> > > referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
> > > and hlists") are indicative of real bugs and we should actually restore
> > > list_empty() to its former glory prior to 1658d35ead5d ("list: Use
> > > READ_ONCE() when testing for empty lists").  Alternatively, assuming
> > > list_empty_careful() does what it says on the tin, we could just make that
> > > the default.
> > 
> > The list_empty_careful() function (suitably annotated) returns false if
> > the list is non-empty, including when it is in the process of becoming
> > either empty or non-empty.  It would be fine for the lockless use cases
> > I have come across.
> 
> Hmm, I had a look at the implementation and I'm not at all convinced that
> it's correct.  First of all, the comment above it states:
> 
>  * NOTE: using list_empty_careful() without synchronization
>  * can only be safe if the only activity that can happen
>  * to the list entry is list_del_init(). Eg. it cannot be used
>  * if another CPU could re-list_add() it.

Huh.  This thing is unchanged since 2.6.12-rc2, back in 2005:

static inline int list_empty_careful(const struct list_head *head)
{
        struct list_head *next = head->next;
        return (next == head) && (next == head->prev);
}

I can imagine compiler value-caching optimizations that would cause
trouble, for example, if a previous obsolete fetch from head->prev was
lying around in a register, causing this function to say "not empty"
when it was in fact empty.  Of course, if obsolete values for both
head->next and head->prev were lying around, pretty much anything
could happen.

> but it seems that people disregard this note and instead use it as a
> general-purpose lockless test, taking a lock and rechecking if it returns
> non-empty.  It would also mean we'd have to keep the WRITE_ONCE() in
> INIT_LIST_HEAD, which is something that I've been trying to remove.
> 
> In the face of something like a concurrent list_add(); list_add_tail()
> sequence, then the tearing writes to the head->{prev,next} pointers could
> cause list_empty_careful() to indicate that the list is momentarily empty.
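To make the "suitably annotated" part above concrete, here is a minimal
sketch (illustration only, not a patch; the _annotated suffix is made up
just to keep it distinct from the real function) of list_empty_careful()
with READ_ONCE() on both loads.  This would deal with the value-caching
hazard I described, in that both pointers are fetched here and fetched
only once, but it does nothing about torn plain writes to
head->{prev,next} on the update side; those would still need WRITE_ONCE()
in the insertion/deletion paths along the lines of 1c97be677f72:

/* Sketch only: force both pointer loads to happen here, exactly once. */
static inline int list_empty_careful_annotated(const struct list_head *head)
{
        struct list_head *next = READ_ONCE(head->next);

        return (next == head) && (next == READ_ONCE(head->prev));
}

Even with both halves annotated, a stale answer immediately after the
return is of course still possible, which is where the usage protocol
comes in.
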
> 
> I've started looking at whether we can use a NULL next pointer to indicate
> an empty list, which might allow us to kill the __list_del_clearprev() hack
> at the same time, but I've not found enough time to really get my teeth into
> it yet.

In the delete-only case, I kind of get it, other than the potential for
optimization.  Once the list becomes empty, it will forever remain empty.
And the additional test of head->prev avoids this returning true while
the deletion is half done (again, aside from the potential for
optimization).

If insertions are allowed, the thing I haven't quite figured out yet is
what is being gained by the additional check of head->prev.  After all,
if updates are not excluded, the return value can become obsolete
immediately anyhow.  Yes, it could be used as a heuristic, but it could
report empty immediately before a list_add(), so there would need to
either be a careful wakeup protocol or a periodic poll of the list.

Or am I missing a trick here?

							Thanx, Paul
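P.S.  For whatever it is worth, the "take a lock and recheck" usage
pattern referred to above is, as I understand it, roughly the following
sketch.  The names (struct myqueue, myqueue_drain) are made up purely
for illustration and correspond to nothing in mainline:

#include <linux/list.h>
#include <linux/spinlock.h>

struct myqueue {
        spinlock_t lock;
        struct list_head entries;
};

static void myqueue_drain(struct myqueue *q)
{
        /* Unlocked test: only a heuristic deciding whether to take the lock. */
        if (list_empty_careful(&q->entries))
                return;

        spin_lock(&q->lock);
        while (!list_empty(&q->entries)) {      /* authoritative recheck */
                struct list_head *e = q->entries.next;

                list_del_init(e);
                /* ... process the entry containing e ... */
        }
        spin_unlock(&q->lock);
}

The locked recheck makes a spurious "non-empty" from the unlocked test
harmless, but a spurious "empty" means this call does nothing at all,
which is exactly the wakeup-or-poll question above.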