Subject: Re: [PATCH -v2 1/5] sched: Fix ttwu() race
From: Chris Wilson
To: Peter Zijlstra, mingo@kernel.org, tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, paulmck@kernel.org, frederic@kernel.org, peterz@infradead.org, torvalds@linux-foundation.org, hch@lst.de
Date: Tue, 21 Jul 2020 11:49:05 +0100
Message-ID: <159532854586.15672.5123219635720172265@build.alporthouse.com>
In-Reply-To: <20200622100825.726200103@infradead.org>
References: <20200622100122.477087977@infradead.org> <20200622100825.726200103@infradead.org>
X-Mailing-List:
linux-kernel@vger.kernel.org

Quoting Peter Zijlstra (2020-06-22 11:01:23)
> @@ -2378,6 +2385,9 @@ static inline bool ttwu_queue_cond(int c
>  static bool ttwu_queue_wakelist(struct task_struct *p, int cpu, int wake_flags)
>  {
>          if (sched_feat(TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags)) {
> +                if (WARN_ON_ONCE(cpu == smp_processor_id()))
> +                        return false;
> +
>                  sched_clock_cpu(cpu); /* Sync clocks across CPUs */
>                  __ttwu_queue_wakelist(p, cpu, wake_flags);
>                  return true;

We've been hitting this warning frequently, but have never seen the
rcu-torture-esque oops ourselves.

<4> [181.766705] RIP: 0010:ttwu_queue_wakelist+0xbc/0xd0
<4> [181.766710] Code: 00 00 00 5b 5d 41 5c 41 5d c3 31 c0 5b 5d 41 5c 41 5d c3 31 c0 f6 c3 08 74 f2 48 c7 c2 00 ad 03 00 83 7c 11 40 01 77 e4 eb 80 <0f> 0b 31 c0 eb dc 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 bf 17
<4> [181.766726] RSP: 0018:ffffc90000003e08 EFLAGS: 00010046
<4> [181.766733] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffff888276a00000
<4> [181.766740] RDX: 000000000003ad00 RSI: ffffffff8232045b RDI: ffffffff8233103e
<4> [181.766747] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
<4> [181.766754] R10: 00000000d3fa25c3 R11: 0000000053712267 R12: ffff88825b912940
<4> [181.766761] R13: 0000000000000000 R14: 0000000000000087 R15: 000000000003ad00
<4> [181.766769] FS: 0000000000000000(0000) GS:ffff888276a00000(0000) knlGS:0000000000000000
<4> [181.766777] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [181.766783] CR2: 000055b8245814e0 CR3: 0000000005610003 CR4: 00000000003606f0
<4> [181.766790] Call Trace:
<4> [181.766794]
<4> [181.766798] try_to_wake_up+0x21b/0x690
<4> [181.766805] autoremove_wake_function+0xc/0x50
<4> [181.766858] __i915_sw_fence_complete+0x1ee/0x250 [i915]
<4> [181.766912] dma_i915_sw_fence_wake+0x2d/0x40 [i915]

We are seeing this on the ttwu_queue() path, so with p->on_cpu=0, and the
warning is cleared up by

-                if (WARN_ON_ONCE(cpu == smp_processor_id()))
+                if (WARN_ON_ONCE(p->on_cpu && cpu == smp_processor_id()))

which would appear to restore the old behaviour for ttwu_queue() and
seem to be consistent with the intent of this patch. Hopefully this
helps identify the problem correctly.
-Chris
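
For illustration only, the difference between the quoted condition and the
proposed one can be sketched as a small userspace model. This is not kernel
code: model_check(), struct ttwu_model_result, queue_ok and guard_on_cpu are
names made up here, and queue_ok simply assumes that the
sched_feat(TTWU_QUEUE) && ttwu_queue_cond() test has already been evaluated.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of one pass through the modelled check. */
struct ttwu_model_result {
	bool warned;	/* the WARN_ON_ONCE() condition was true */
	bool queued;	/* the wakeup would go onto the wakelist */
};

/*
 * Hypothetical userspace model, not kernel code. queue_ok stands in for
 * sched_feat(TTWU_QUEUE) && ttwu_queue_cond(cpu, wake_flags).
 */
static struct ttwu_model_result
model_check(bool queue_ok, bool guard_on_cpu, int on_cpu, int cpu, int this_cpu)
{
	struct ttwu_model_result r = { false, false };

	if (!queue_ok)
		return r;

	/* guard_on_cpu selects the proposed condition over the quoted one. */
	if ((!guard_on_cpu || on_cpu) && cpu == this_cpu) {
		r.warned = true;	/* models "return false" after the warning */
		return r;
	}

	r.queued = true;
	return r;
}

static void model_self_check(void)
{
	/* ttwu_queue() path: p->on_cpu == 0, target is the local CPU. */
	struct ttwu_model_result quoted = model_check(true, false, 0, 3, 3);
	struct ttwu_model_result proposed = model_check(true, true, 0, 3, 3);

	assert(quoted.warned && !quoted.queued);
	assert(!proposed.warned && proposed.queued);

	/* p->on_cpu != 0 on the local CPU still warns with either condition. */
	assert(model_check(true, true, 1, 3, 3).warned);
}
```

In this model the p->on_cpu=0 case no longer trips the warning and falls
through to queueing, while a local wakeup with p->on_cpu set is still caught.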