Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp4685777ybf; Wed, 4 Mar 2020 08:41:27 -0800 (PST) X-Google-Smtp-Source: ADFU+vsDiBcaDNOMl3zDoBDQd2rorMoECJmVpFvJ3wKIAfNSENvwgdxAy4711e3pDfau84lB28Du X-Received: by 2002:a9d:6ad1:: with SMTP id m17mr3244242otq.198.1583340087312; Wed, 04 Mar 2020 08:41:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583340087; cv=none; d=google.com; s=arc-20160816; b=jn+XO0XqT7LHQkVVUr3WEUbYVt1hGXx3pVIeWZnyG9Y0baCwmlTJNdirC7RtBYLPlu 3MVkWVTCEm7a5Fn4QJISdp791MX/Hv2kVNe1qx31XSelRIfAFXCCG5BVRbAhLVMa/8O7 Y5uz0GhtoJ/6NGnAO8bOqc/8xfrKiEfjw49CmavWTVW9JQr320pis2Eku/s+LiUr2TlH Pe0cMoXDVkiHDha8z05IItl5AkeU6BbmbTsuxMA9yL4AqPpLnwFlLd6NmCfCCTYzsKZu HEr2sMUfHxmWHLQPcQ3lyZoaAukJGR6rGlMWpwcT/CKOZk/KGryKxvGi3WK6uC/3qkKt iV3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=CvrEvD+7opql/JJxEjmiYA8aCM4NclmuHbRnspthzOQ=; b=Qrxy6ugW62D91Q9Z0Zty/fShM2TD2B6e6NLT7AL6A6Z13cSsCb8CRnwrAxbiohydqW dwiNaZgsGfr0V2nEntgYSjLnkjT0Ydl7BmFU2SdyAt6oE3kHcTm/e7PKyCxbtvU8RoP3 /IjnB+Cn7qm6FU/6LA8wFRsiCKQ9exh37Yw3ULsex9iqB/Gffmop9DKFEAe2iCkmslUQ 304igCAPaiqHCaZVOrXEQU+N9s9mr4vPq0/hMM0rldZeBQlQmGC1Jr2ETODkvFjcRIAF WEaCaYrcBpODML7+tj6NrxKJ2r5YTSA6z4JMf5pq4tEDp2GDQzsFHngQ55qTg7XKYDnt hYxg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=L2owA0PX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j184si1459573oih.216.2020.03.04.08.41.14; Wed, 04 Mar 2020 08:41:27 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=L2owA0PX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729749AbgCDQke (ORCPT + 99 others); Wed, 4 Mar 2020 11:40:34 -0500 Received: from mail-ot1-f66.google.com ([209.85.210.66]:38700 "EHLO mail-ot1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729573AbgCDQke (ORCPT ); Wed, 4 Mar 2020 11:40:34 -0500 Received: by mail-ot1-f66.google.com with SMTP id i14so2648632otp.5 for ; Wed, 04 Mar 2020 08:40:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=CvrEvD+7opql/JJxEjmiYA8aCM4NclmuHbRnspthzOQ=; b=L2owA0PXi7PzoM+3h/nzHuuXgk1zbzhEGtVGOA4bQCwug/mMMXHJccNxFPxih+xQt/ ryLN/Y4lNRKBXmG4qFjviZejnmh+2amvXH2RRLwEAg2D6MCM0RD+JC2WEM+NeHCIWO98 CcGOjFV03LVLF2O7sWH92P9M/BbcVd+Ps3nnomI4JhVD3fHJfeNube3YCVi+aKwRLB7p NAXvZzCASro4QZlCyryKOx4LGFX+gSx0ldJS4uacl2vQ4AMKtE4oIlornXxQcBjJnbCg quOJFhbGOQznZPSevfO+36c1dRLGNRfI6Rwq6mLvLirshaXa3CyJqIwfKU42G12rlu5A QX/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=CvrEvD+7opql/JJxEjmiYA8aCM4NclmuHbRnspthzOQ=; b=iGWhpwyfD5g7b1i14mQ7t14fBSIMh6+69ZbalIMgW3aD46JIQFLH9yXU0+eYN/UC8T Y9v3ZgpJ/5lbOW0elHpgV9sM2GcSOeM/hhlsQ+SP90y+3nhp9hfX+EEJxRuNbOZruQ+0 cDqR6NITj/SmOgkEyw6CvWgkQSF392eZ+XiQIw2UipAKUDRjH/T3UP3i/juuQyVwb3pj wAT2BcTQjadgI/f5wZn6GsEK2Ng8oq5SyBMcwo5s9Jt2HIqyfcofJotj1rCERGO78KMu FCcF9coUGHjJU4YRIL0jdprRNgxhgglKE3bT9BP7Z/efJcTVIKANWABSz/SyYU2vnCnr QJCQ== X-Gm-Message-State: ANhLgQ3VzFp1n+CuFfPP3L7xfi5UCnhKAj1hFEoOM+6g+TeaGZ55cQIo +X5UnojSTTew1pIMbV7QRAMV0zMzs1hkGGSJ2hYUFA== X-Received: by 2002:a05:6830:16c8:: with SMTP id l8mr3013088otr.2.1583340032952; Wed, 04 Mar 2020 08:40:32 -0800 (PST) MIME-Version: 1.0 References: <20200304031551.1326-1-cai@lca.pw> <20200304033329.GZ29971@bombadil.infradead.org> <20200304040515.GX2935@paulmck-ThinkPad-P72> <20200304043356.GC29971@bombadil.infradead.org> <20200304141021.GY2935@paulmck-ThinkPad-P72> In-Reply-To: <20200304141021.GY2935@paulmck-ThinkPad-P72> From: Marco Elver Date: Wed, 4 Mar 2020 17:40:21 +0100 Message-ID: Subject: Re: [PATCH -next] lib: disable KCSAN for XArray To: "Paul E. McKenney" Cc: Matthew Wilcox , Qian Cai , LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 4 Mar 2020 at 15:10, Paul E. McKenney wrote: > > On Tue, Mar 03, 2020 at 08:33:56PM -0800, Matthew Wilcox wrote: > > On Tue, Mar 03, 2020 at 08:05:15PM -0800, Paul E. McKenney wrote: > > > On Tue, Mar 03, 2020 at 07:33:29PM -0800, Matthew Wilcox wrote: > > > > On Tue, Mar 03, 2020 at 10:15:51PM -0500, Qian Cai wrote: > > > > > Functions like xas_find_marked(), xas_set_mark(), and xas_clear_mark() > > > > > could happen concurrently result in data races, but those operate only > > > > > on a single bit that are pretty much harmless. For example, I currently do not see those as justification to blacklist the whole file. Wouldn't __no_kcsan be better? That is, in case there is no other solution that emerges from the remainder of the discussion here. > > > > Those aren't data races. The writes are protected by a spinlock and the > > > > reads by the RCU read lock. If the tool can't handle RCU protection, > > > > it's not going to be much use. > > > > > > Would KCSAN's ASSERT_EXCLUSIVE_BITS() help here? > > > > I'm quite lost in the sea of new macros that have been added to help > > with KCSAN. It doesn't help that they're in -next somewhere that I > > can't find, and not in mainline yet. Is there documentation somewhere? > > Yes, there is documentation. In -next somewhere. :-/ > > Early days, apologies for the construction! I just sent a patch updating documentation, just to make sure we have something up-to-date we can refer to in the meantime. A preview of generated documentation for ASSERT_EXCLUSIVE_BITS (and others) is here: https://htmlpreview.github.io/?https://github.com/google/ktsan/blob/kcsan-kerneldoc/output/dev-tools/kcsan.html#race-detection-beyond-data-races Hope that helps somewhat. > > > RCU readers -do- exclude pre-insertion initialization on the one hand, > > > and those post-removal accesses that follow a grace period, but only > > > if that grace period starts after the removal. In addition, the > > > accesses due to rcu_dereference(), rcu_assign_pointer(), and similar > > > are guaranteed to work even if they are concurrent. > > > > > > Or am I missing something subtle here? > > > > I probably am. An XArray is composed of a tree of xa_nodes: > > > > struct xa_node { > > unsigned char shift; /* Bits remaining in each slot */ > > unsigned char offset; /* Slot offset in parent */ > > unsigned char count; /* Total entry count */ > > unsigned char nr_values; /* Value entry count */ > > struct xa_node __rcu *parent; /* NULL at top of tree */ > > struct xarray *array; /* The array we belong to */ > > union { > > struct list_head private_list; /* For tree user */ > > struct rcu_head rcu_head; /* Used when freeing node */ > > }; > > void __rcu *slots[XA_CHUNK_SIZE]; > > union { > > unsigned long tags[XA_MAX_MARKS][XA_MARK_LONGS]; > > unsigned long marks[XA_MAX_MARKS][XA_MARK_LONGS]; > > }; > > }; > > > > 'shift' is initialised before the node is inserted into the tree. > > Ditto 'offset'. > > Very good, then both ->shift and ->offset can be accessed using normal > C-language loads and stores even by most strict definition of data race. > > > 'count' and 'nr_values' should only be touched with the > > xa_lock held. 'parent' might be modified with the lock held and an RCU > > reader expecting to see either the previous or new value. 'array' should > > not change once the node is inserted. private_list is, I believe, only > > modified with the lock held. 'slots' may be modified with the xa_lock > > held, and simultaneously read by an RCU reader. Ditto 'tags'/'marks'. > > If ->count and ->nr_values are never accessed by readers, they can also > use plain C-language loads and stores. > > KCSAN expects that accesses to the ->parent field should be marked. > But if ->parent is always accessed via things like rcu_dereference() > and rcu_assign_pointer() (guessing based on the __rcu), then KCSAN > won't complain. > > The ->array can be accessed using plain C-language loads and stores. > > If ->private_list is never accessed without holding the lock, then > plain C-language loads and stores work for it without KCSAN complaints. > > The situation with ->slots is the same as that for ->parent. > > KCSAN expects accesses to the ->tags[] and ->marks[] arrays to be marked. > However, the default configuration of KCSAN asks only that the reads > be marked. (Within RCU, I instead configure KCSAN so that it asks that > both be marked, but it is of course your choice within your code.) > > > The RCU readers are prepared for what they see to be inconsistent -- > > a fact of life when dealing with RCU! So in a sense, yes, there is a > > race there. But it's a known, accepted race, and that acceptance is > > indicated by the fact that the RCU lock is held. Does there need to be > > more annotation here? Or is there an un-noticed bug that the tool is > > legitimately pointing out? > > The answer to both questions is "maybe", depending on the code using > the values read. Yes, it would be nice if KCSAN could figure out the > difference, but there are limits to what a tool can do. And things > are sometimes no-obvious, for example: > > switch (foo) { > case 1: do_something_1(); break; > case 3: do_something_3(); break; > case 7: do_something_7(); break; > case 19: do_something_19(); break; > case 23: do_something_23(); break; > } > > Only one access to "foo", so all is well, right? > > Sadly, wrong. Compilers can create jump tables, and will often emit two > loads from "foo", one to check against the table size, and the other to > index the table. > > Other potential traps may be found in https://lwn.net/Articles/793253/ > ("Who's afraid of a big bad optimizing compiler?"). > > One approach is to use READ_ONCE() on the reads in the RCU read-side > critical section that are subject to concurrent update. Another is to > use the data_race() macro (as in data_race(foo) in the switch statement > above) to tell KCSAN that you have analyzed the compiler's response. > These first two can be mixed, if you like. And yet another is the patch > proposed by Qian if you want KCSAN to get out of your code altogether. > > Within RCU, I mark the accesses rather aggressively. RCU is quite > concurrent, and the implied documentation is very worthwhile. > > Your mileage may vary, of course. For one thing, your actuarial > statistics are quite likley significantly more favorable than are mine. > Not that mine are at all bad, particularly by the standards of a century > or two ago. ;-) > > Thanx, Paul