Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp5508587rwb; Sun, 4 Dec 2022 23:21:46 -0800 (PST) X-Google-Smtp-Source: AA0mqf4GHYyWuS7H1FStfZytBkpxB8qQJPyiizKzcUrAzrM6f/PYfvWptZ/Mk90fmszlveCD6sbF X-Received: by 2002:a50:bac6:0:b0:46b:ada3:4ee with SMTP id x64-20020a50bac6000000b0046bada304eemr20318334ede.64.1670224906262; Sun, 04 Dec 2022 23:21:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670224906; cv=none; d=google.com; s=arc-20160816; b=CqbOSPkJFZ+4ji7diPUuAsdjCOgFYiHn50rkdMJ+z6sapf5IPDocfygdgRZQgRaUvr 5q1lzjFVpT3pcEW/t0sMoZsTptdSEFqpzDZ+gBfFnfomlgXEbU6jgrEC9Nc0xizv8TTx yacRoxO1upSInlVwzfbaFPABVe6RE9UU1sJcEasiXUbiAOIs8SXN/U7OWL52l+UdPU49 yPweu4TZ/Pf+SPH9H4YQDJcMBjrH8+EHEr4OOQsGu9k5H5tRhKPB53ZgUkPoj87QQsuA rQ62q7RD+9vo6uEnxr5YXBkIcjc6HiQwZ9F29x6dRUCOqBwXJ5gZFM2IRxhZJiNm8XfF h22Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=coRXOHRNz//WTvn8iz/sAzTb1yJBt0dUNQrHlt2mpPs=; b=K6qBQm+CmbsOR4NE/n0OgIkXsKg1LUZcZmHc7vn9bCdsUQr86/pKns5JX0P5vfqajJ 5mDr68pZKLSmL/wxneuMzO8jDikGk9ATXs0OmKWpxLUrJzZUwfQ1UNfCCNZPWQsuyg0+ Y8YxKwJeSySY3/v6ZuzX2UWf73wxIxydSN7TBIUVdWVYcHXfWuMbJ2NFQZFz9YFWNY3/ yD1+JV23ulQvxcONfEgW3g+YNikgf6ZcYfAZ7ShMRR0AAQL8PgQV1KXNzFExjOFNlnZR zF9sxKzN/HWWEa+TrQgLMwFWLleUaijTC31MF71N+alp3EuKWarII2EAG+e3AkDSyFiF 102g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=cp2c4vLv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sd8-20020a1709076e0800b007ae43ee86aesi11961544ejc.69.2022.12.04.23.21.24; Sun, 04 Dec 2022 23:21:46 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=cp2c4vLv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231292AbiLEHBN (ORCPT + 82 others); Mon, 5 Dec 2022 02:01:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39080 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231421AbiLEHAv (ORCPT ); Mon, 5 Dec 2022 02:00:51 -0500 Received: from mail-yw1-x1132.google.com (mail-yw1-x1132.google.com [IPv6:2607:f8b0:4864:20::1132]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9100BF7C for ; Sun, 4 Dec 2022 23:00:46 -0800 (PST) Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-3704852322fso108443177b3.8 for ; Sun, 04 Dec 2022 23:00:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=coRXOHRNz//WTvn8iz/sAzTb1yJBt0dUNQrHlt2mpPs=; b=cp2c4vLvpDWA0mcSMbgN+VRWTA9DqaDSNKkxVBxIu7MdTZray7sSNLrIl20bdFnlL0 KuswNgOgKZuar9qFfg3zDSP4eB3V30hL4u4a+vthyZ1dLzDdOwLeqFzDA1SxC2oD2F/K c5K1UCDm42XTnU8MaTGaMgVCR9zjcLmPb/VNuXAGVsFac9utRPXA7IvkYQ1vxJtAhX/z PkdIujfBoBkGmeir5HfFGNu1a0u5KsmdQEUE0YEc2CvTj7XseBcUhWCeqaREWYmEz8Ow 6Hivt5EJmmtmp6I60cJKxtybuRswUNjkPAK69gPiKXRrGAl++llFBfg3Fyh5/C2ZxsSz xquA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=coRXOHRNz//WTvn8iz/sAzTb1yJBt0dUNQrHlt2mpPs=; b=wn/j/c9ZKj7LLnnoYLDmhimvLFdS35WSa0uoYgBrec6IpxNJzRUm5jNKg0ktHVCTqj 5m/MHLcTG9Vv0ljVBTd8nbmVTdDjmR1SHmkPgTrNariybnVW5JCUcUr+1PLF/3y7bAH9 3QuIVcpXH94XaC7sTPXViqjJtE2ZA+2ZkhnkENhBRRVpHJekal2BpkMCevi1rB3ePbeU diEMoAER+bOKBSQlCEkyvschzm6AcgrpCSjorgusvW60r+PwJtsiqlV3N1JyphYPed2a PYy4Tvd10OQoILelvgJCXU5ReRSp8ANZbuspdZChB4TUOROlEVcUNYTPBcjbw3mG3M8r cn8w== X-Gm-Message-State: ANoB5pmXrosFZ2jvAgesAqqr9gojEyKglf/rP89YbUvp5Ndwrxl9xIMS 481GMakJNW18DfIz8vQijZvJJQvUpVrUc4M++k6w+g== X-Received: by 2002:a81:1915:0:b0:3bf:9e45:1139 with SMTP id 21-20020a811915000000b003bf9e451139mr39230763ywz.267.1670223645646; Sun, 04 Dec 2022 23:00:45 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Marco Elver Date: Mon, 5 Dec 2022 08:00:00 +0100 Message-ID: Subject: Re: arm64: allmodconfig: BUG: KCSAN: data-race in p9_client_cb / p9_client_rpc To: Dominique Martinet Cc: Naresh Kamboju , rcu , open list , kunit-dev@googlegroups.com, lkft-triage@lists.linaro.org, kasan-dev , "Paul E. McKenney" , Netdev , Anders Roxell Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, 4 Dec 2022 at 00:08, Dominique Martinet wrote: > > Marco Elver wrote on Sat, Dec 03, 2022 at 05:46:46PM +0100: > > > But I can't really find a problem with what KCSAN complains about -- > > > we are indeed accessing status from two threads without any locks. > > > Instead of a lock, we're using a barrier so that: > > > - recv thread/cb: writes to req stuff || write to req status > > > - p9_client_rpc: reads req status || reads other fields from req > > > > > > Which has been working well enough (at least, without the barrier things > > > blow up quite fast). > > > > > > So can I'll just consider this a false positive, but if someone knows > > > how much one can read into this that'd be appreciated. > > > > The barriers only ensure ordering, but not atomicity of the accesses > > themselves (for one, the compiler is well in its right to transform > > plain accesses in ways that the concurrent algorithm wasn't designed > > for). In this case it looks like it's just missing > > READ_ONCE()/WRITE_ONCE(). > > Aha! Thanks for this! > > I've always believed plain int types accesses are always atomic and the > only thing to watch for would be compilers reordering instrucions, which > would be ensured by the barrier in this case, but I guess there are some > architectures or places where this isn't true? > > > I'm a bit confused though, I can only see five places where wait_event* > functions use READ_ONCE and I believe they more or less all would > require such a marker -- I guess non-equality checks might be safe > (waiting for a value to change from a known value) but if non-atomic > updates are on the table equality and comparisons checks all would need > to be decorated with READ_ONCE; afaiu, unlike usespace loops with > pthread_cond_wait there is nothing protecting the condition itself. > > Should I just update the wrapped condition, as below? > > - err = wait_event_killable(req->wq, req->status >= REQ_STATUS_RCVD); > + err = wait_event_killable(req->wq, > + READ_ONCE(req->status) >= REQ_STATUS_RCVD); Yes, this looks good! > The writes all are straightforward, there's all the error paths to > convert to WRITE_ONCE too but that's not difficult (leaving only the > init without such a marker); I'll send a patch when you've confirmed the > read looks good. > (the other reads are a bit less obvious as some are protected by a lock > in trans_fd, which should cover all cases of possible concurrent updates > there as far as I can see, but this mixed model is definitely hard to > reason with... Well, that's how it was written and I won't ever have time > to rewrite any of this. Enough ranting.) If the lock-protected accesses indeed are non-racy, they should be left unmarked. If some assumption here turns out to be wrong, KCSAN would (hopefully) tell us one way or another. Thanks! -- Marco