From: Marco Elver
Date: Tue, 28 Jan 2020 09:18:38 +0100
Subject: Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next
To: Qian Cai
Cc: Peter Zijlstra, Will Deacon, Ingo Molnar, Linux Kernel Mailing List, "paul E. McKenney"

On Tue, 28 Jan 2020 at 04:13, Qian Cai wrote:
>
> > On Jan 23, 2020, at 4:36 AM, Peter Zijlstra wrote:
> >
> > On Wed, Jan 22, 2020 at 11:38:51PM +0100, Marco Elver wrote:
> >
> >> If possible, decode and get the line numbers. I have observed a data
> >> race in osq_lock before, however, this is the only one I have recently
> >> seen in osq_lock:
> >>
> >> read to 0xffff88812c12d3d4 of 4 bytes by task 23304 on cpu 0:
> >>  osq_lock+0x170/0x2f0 kernel/locking/osq_lock.c:143
> >>
> >>         while (!READ_ONCE(node->locked)) {
> >>                 /*
> >>                  * If we need to reschedule bail... so we can block.
> >>                  * Use vcpu_is_preempted() to avoid waiting for a preempted
> >>                  * lock holder:
> >>                  */
> >> -->             if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
> >>                         goto unqueue;
> >>
> >>                 cpu_relax();
> >>         }
> >>
> >> where
> >>
> >>         static inline int node_cpu(struct optimistic_spin_node *node)
> >>         {
> >> -->             return node->cpu - 1;
> >>         }
> >>
> >>
> >> write to 0xffff88812c12d3d4 of 4 bytes by task 23334 on cpu 1:
> >>  osq_lock+0x89/0x2f0 kernel/locking/osq_lock.c:99
> >>
> >>         bool osq_lock(struct optimistic_spin_queue *lock)
> >>         {
> >>                 struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
> >>                 struct optimistic_spin_node *prev, *next;
> >>                 int curr = encode_cpu(smp_processor_id());
> >>                 int old;
> >>
> >>                 node->locked = 0;
> >>                 node->next = NULL;
> >> -->             node->cpu = curr;
> >>
> >
> > Yeah, that's impossible. This store happens before the node is
> > published, so no matter how the load in node_cpu() is shattered, it must
> > observe the right value.
>
> Marco, any thought on how to do something about this? The worry is that
> too many false positives like this will render the tool useless as a
> general debug option.

This should be an instance of a same-value store: node->cpu is per-CPU,
and smp_processor_id() should always return the same value for a given
node, at least once the node is published.

I believe the data race I observed here dates from before KCSAN had
KCSAN_REPORT_VALUE_CHANGE_ONLY enabled on syzbot, and it hasn't been
observed since. For the most part, that option should deal with this
case.

I will reply separately to your other email about the other data race.

Thanks,
-- Marco
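
To illustrate the same-value-store argument, here is a minimal userspace
sketch in C. It is not KCSAN's implementation and not kernel code: the
struct and every demo_* name are hypothetical, and the "+ 1" encoding is
assumed only because node_cpu() above subtracts one. It just models a
filter that reports a conflicting store only when the watched word ends
up holding a different value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue node: only the field under discussion. */
struct demo_node {
	int cpu;
};

/* Assumed inverse of node_cpu() from the quoted code: store CPU number + 1. */
static int demo_encode_cpu(int cpu_nr)
{
	return cpu_nr + 1;
}

/* Models the flagged store in osq_lock(): node->cpu = curr. */
static void demo_store_cpu(struct demo_node *node, int cpu_nr)
{
	assert(node != NULL);
	node->cpu = demo_encode_cpu(cpu_nr);
}

/*
 * Hypothetical value-change-only filter: snapshot the watched word, let
 * the conflicting store happen, and report only if the value differs
 * afterwards. A same-value store is therefore never reported.
 */
static bool demo_should_report(struct demo_node *node, int writer_cpu_nr)
{
	int before = node->cpu;

	demo_store_cpu(node, writer_cpu_nr);
	return node->cpu != before;
}
```

With a node that CPU 3 has already initialized, demo_should_report(&n, 3)
returns false because the store rewrites the same encoded value, while a
store on behalf of a different CPU number would be reported.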