Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp6922858ybf; Fri, 6 Mar 2020 07:06:20 -0800 (PST) X-Google-Smtp-Source: ADFU+vt2rTJ/f7aPpW5Lk5A1E8tcdXbCprUvPj9KM+mDvI0jCYDiagwbagPSEqxEVC3s+Pf3vlB2 X-Received: by 2002:a05:6830:d4:: with SMTP id x20mr2900604oto.243.1583507180266; Fri, 06 Mar 2020 07:06:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583507180; cv=none; d=google.com; s=arc-20160816; b=HxAvciimgpubrwRoGB1GeO8G38qr5JvG7KxbypnAF7nKLrL1y/gU2WEkaw2kpL8mBm r3Vm9M69nunVN0Ma1/nXa9KNtlIMGwk+afqYwPfx/abjJ5U9q90KbMf8bNIbv7FmXOnO grpKOyWmoDDGUlvKxuOIwaJqII5ySlOjDH+IHJtePcWkpe4YVlYBu7Dp4UqAD5J4KLPz uJJ54BWldLdR6lcFAv4EBBx26TO5qhHFII9Lk+NTitzd6J5P1QQu+1XuTwBdofv9bolr O4Upw/4yNytRMQ/tgFWLLlfbLx24Zy1QzSGuWUYGLu9STCqbhAEVsd4GzTuL2RpW84Z2 3ZRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id :dkim-signature; bh=JZwOxDEE6FTlabp1Erp1Tw5Doyf6ABG7BAUGaMzAXBE=; b=OssNvIDk5GXugndMvDfoXTTw5dYN7QWFMR7Nd033K2ORP3aSopMDZ3j7NlUN52aFab XQ4LGKzDQSqr+Pe0BGwqZsSwCdtEvnPzXTetWDEcODb7WYG6k8m/ETjtbruMod4CuiPN 3VJCipXrTEz+FdtYQu8B9fWEeY7x57oYQaFN1BzvriaGcbOJOmtTZ4ED2GJyZltJCVXy 4wGRapxxVMvvAGv5FxzWCLvTikGKmPLU4eXabRsPWz0DJPNeSB89rawwHpoaMRbEYpxa vLPEvAo5F8IUfnkLZ7ynG6YVutHvLkbH+B9Bmla2YZVdQlpM3WAMBkji9ImkdIRkZsD3 8kQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@lca.pw header.s=google header.b=n3yfRISi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id s207si1474522oih.255.2020.03.06.07.05.56; Fri, 06 Mar 2020 07:06:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@lca.pw header.s=google header.b=n3yfRISi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727141AbgCFPC5 (ORCPT + 99 others); Fri, 6 Mar 2020 10:02:57 -0500 Received: from mail-qv1-f67.google.com ([209.85.219.67]:41905 "EHLO mail-qv1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726171AbgCFPC5 (ORCPT ); Fri, 6 Mar 2020 10:02:57 -0500 Received: by mail-qv1-f67.google.com with SMTP id s15so1041232qvn.8 for ; Fri, 06 Mar 2020 07:02:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=lca.pw; s=google; h=message-id:subject:from:to:cc:date:in-reply-to:references :mime-version:content-transfer-encoding; bh=JZwOxDEE6FTlabp1Erp1Tw5Doyf6ABG7BAUGaMzAXBE=; b=n3yfRISiL7NMWNlXjtM+56EXcJApssI1b+rz0/YnhKXR9fp6xJQExtQrvKwmje9c7+ zxQJdzbo2rcRpaCjNr5UYpYkwd8FRe/Gnib9Y27v53voEcY0Dgc/Qim2IgTzG62RFxvA hqiWH5v7W/HpKS7ODeOG7x+xepP9SVuj30XyhvZ3mOS91+jiw/5RyYV1Z31yHq00M3O7 iHyb+1tm2UmUCrkCZwdE8/D0zEFnZJrTMcZ84yzlJLSDdsyZueUcmOtMtwHP3JpX28kj gGrv+N5lQLBO6TpOmdpz6Sbfp7lS8m6+8PcivYJPEMfvID1NFbQc5DOVyAG7G81kGoR5 qhNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:subject:from:to:cc:date:in-reply-to :references:mime-version:content-transfer-encoding; bh=JZwOxDEE6FTlabp1Erp1Tw5Doyf6ABG7BAUGaMzAXBE=; b=ZyeZ1+ZeC5YPXDRnNAXAC4eTzicHz+g7aIqfzYlRBIp0ft9EdNu33jZnduDhOaLVws KkC9yC2j485xmOluSxS7Fszkfw1zyFSwUcEP32HLpTsSmDVegDsY+t2aL6NGHIBKGaiN ruB1imXL0Cyq06HJONzOU09cEA3nqq69A6IcWfxTSDDps+puvtmlVS9tG3OLkcW2E40V llbHy4uxVgqAMcGSJ1KbKTbZ53XsVIswTMG3Y9jv/pEMgiP6EpMDSK93au43QTsaGvOB 9drt5x7ZiwrHowu8Wa1ETv1hBgqlI3wZwJ4f4Wzl3bi3hiBKmeBRu3YkJVF+KZZKdQ41 iArQ== X-Gm-Message-State: ANhLgQ2BjUbJ5MBuzFwL3uHn2DWgJ9wceX5ci9pquPMEtLTsJyBGoFoW JtBjQLIeK8zJ0KFWumgJXrTt7g== X-Received: by 2002:a0c:bf05:: with SMTP id m5mr3287276qvi.26.1583506976090; Fri, 06 Mar 2020 07:02:56 -0800 (PST) Received: from dhcp-41-57.bos.redhat.com (nat-pool-bos-t.redhat.com. [66.187.233.206]) by smtp.gmail.com with ESMTPSA id 62sm14856664qkk.84.2020.03.06.07.02.55 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 06 Mar 2020 07:02:55 -0800 (PST) Message-ID: <1583506974.7365.160.camel@lca.pw> Subject: Re: [PATCH] tick/sched: fix data races at tick_do_timer_cpu From: Qian Cai To: Thomas Gleixner Cc: fweisbec@gmail.com, mingo@kernel.org, elver@google.com, linux-kernel@vger.kernel.org, Peter Zijlstra Date: Fri, 06 Mar 2020 10:02:54 -0500 In-Reply-To: <1C65422C-FFA4-4651-893B-300FAF9C49DE@lca.pw> References: <87tv34laqq.fsf@nanos.tec.linutronix.de> <1C65422C-FFA4-4651-893B-300FAF9C49DE@lca.pw> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.22.6 (3.22.6-10.el7) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2020-03-04 at 06:20 -0500, Qian Cai wrote: > > On Mar 4, 2020, at 4:39 AM, Thomas Gleixner wrote: > > > > They are reported, but are they actually a real problem? > > > > This completely lacks analysis why these 8 places need the > > READ/WRITE_ONCE() treatment at all and if so why the other 14 places > > accessing tick_do_timer_cpu are safe without it. > > > > I definitely appreciate the work done with KCSAN, but just making the > > tool shut up does not cut it. > > Looks at tick_sched_do_timer(), for example, > > if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) { > #ifdef CONFIG_NO_HZ_FULL > WARN_ON(tick_nohz_full_running); > #endif > tick_do_timer_cpu = cpu; > } > #endif > > /* Check, if the jiffies need an update */ > if (tick_do_timer_cpu == cpu) > tick_do_update_jiffies64(now); > > Could we rule out all compilers and archs will not optimize it if !CONFIG_NO_HZ_FULL to, > > if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE) || tick_do_timer_cpu == cpu) > tick_do_update_jiffies64(now); > > So it could save a branch or/and realized that tick_do_timer_cpu is not used later in this cpu, so it could discard the store? > > I am not all that familiar with all other 14 places if it is possible to happen concurrently as well, so I took a pragmatic approach that for now only deal with the places that KCSAN confirmed, and then look forward for an incremental approach if there are more places needs treatments later once confirmed. If you don't think that will be happening and dislike using READ/WRITE_ONCE() for documentation purpose, we could annotate those using the data_race() macro. Is that more acceptable? diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index a792d21cac64..08ce4088da87 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -129,7 +129,7 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)    * If nohz_full is enabled, this should not happen because the    * tick_do_timer_cpu never relinquishes.    */ - if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) { + if (unlikely(data_race(tick_do_timer_cpu == TICK_DO_TIMER_NONE))) {  #ifdef CONFIG_NO_HZ_FULL   WARN_ON(tick_nohz_full_running);  #endif @@ -138,7 +138,7 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)  #endif     /* Check, if the jiffies need an update */ - if (tick_do_timer_cpu == cpu) + if (data_race(tick_do_timer_cpu == cpu))   tick_do_update_jiffies64(now);     if (ts->inidle) @@ -737,8 +737,9 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)    * Otherwise we can sleep as long as we want.    */   delta = timekeeping_max_deferment(); - if (cpu != tick_do_timer_cpu && -     (tick_do_timer_cpu != TICK_DO_TIMER_NONE || !ts->do_timer_last)) + if (data_race(cpu != tick_do_timer_cpu) && +     (data_race(tick_do_timer_cpu != TICK_DO_TIMER_NONE) || +     !ts->do_timer_last))   delta = KTIME_MAX;     /* Calculate the next expiry time */ @@ -771,10 +772,10 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)    * do_timer() never invoked. Keep track of the fact that it    * was the one which had the do_timer() duty last.    */ - if (cpu == tick_do_timer_cpu) { + if (data_race(cpu == tick_do_timer_cpu)) {   tick_do_timer_cpu = TICK_DO_TIMER_NONE;   ts->do_timer_last = 1; - } else if (tick_do_timer_cpu != TICK_DO_TIMER_NONE) { + } else if (data_race(tick_do_timer_cpu != TICK_DO_TIMER_NONE)) {   ts->do_timer_last = 0;   }