From: "Joel Fernandes (Google)"
To: linux-kernel@vger.kernel.org, Ingo Molnar, Peter Zijlstra,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider
Cc: "Joel Fernandes (Google)", "Paul E. McKenney", stable@vger.kernel.org
Subject: [PATCH] sched/rt: Fix live lock between select_fallback_rq() and RT push
Date: Sat, 23 Sep 2023 01:14:08 +0000
Message-ID: <20230923011409.3522762-1-joel@joelfernandes.org>

During RCU-boost testing with the TREE03 rcutorture config, I found that
the machine locks up after a few hours.

Tracing showed a live lock between two CPUs. One CPU has an RT task
running, while another CPU, which also has an RT task running, is being
offlined. During this offlining, all threads are migrated. The migration
thread is repeatedly scheduled to migrate actively running tasks on the
CPU being offlined. This results in a live lock because
select_fallback_rq() keeps picking the CPU that an RT task is already
running on, only to get pushed back to the CPU being offlined.

In any case, it is pointless to push a task to a CPU that is being
offlined, only for the task to be migrated away somewhere else. This
could also add unwanted latency to the task.

Fix these issues by not selecting CPUs in RT if they are not 'active'
for scheduling, using the cpu_active_mask. Other parts of core.c already
use cpu_active_mask to prevent tasks from being put on CPUs going
offline.

Tested-by: Paul E. McKenney
Cc: stable@vger.kernel.org
Signed-off-by: Joel Fernandes (Google)
---
 kernel/sched/cpupri.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c
index a286e726eb4b..42c40cfdf836 100644
--- a/kernel/sched/cpupri.c
+++ b/kernel/sched/cpupri.c
@@ -101,6 +101,7 @@ static inline int __cpupri_find(struct cpupri *cp, struct task_struct *p,
 
 	if (lowest_mask) {
 		cpumask_and(lowest_mask, &p->cpus_mask, vec->mask);
+		cpumask_and(lowest_mask, lowest_mask, cpu_active_mask);
 
 		/*
 		 * We have to ensure that we have at least one bit
-- 
2.42.0.515.g380fc7ccd1-goog
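
For readers who want to see the effect of the extra cpumask_and() in
isolation, here is a minimal userspace sketch. It is not kernel code: it
assumes a 64-bit integer can stand in for struct cpumask, and every
toy_* name is hypothetical and exists only for this illustration. It
shows how intersecting the candidate mask with the active mask drops a
CPU that is going offline from the set of possible push targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: bit n set means CPU n. */
typedef uint64_t toy_cpumask_t;

/*
 * Toy model of the candidate-mask computation in the hunk above.
 * "allowed" plays the role of p->cpus_mask, "prio_vec" the role of
 * vec->mask, and "active" the role of cpu_active_mask. These names
 * are assumptions made for illustration only.
 */
static toy_cpumask_t toy_lowest_mask(toy_cpumask_t allowed,
                                     toy_cpumask_t prio_vec,
                                     toy_cpumask_t active)
{
    toy_cpumask_t lowest = allowed & prio_vec; /* step already present */

    lowest &= active;                          /* step added by the patch */
    return lowest;
}

/* Return the lowest-numbered CPU set in the mask, or -1 if it is empty. */
static int toy_first_cpu(toy_cpumask_t mask)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (mask & ((toy_cpumask_t)1 << cpu))
            return cpu;
    return -1;
}

/* Small usage example: CPUs 0-3 allowed, CPUs 1 and 2 at this priority. */
static void toy_selftest(void)
{
    /* All CPUs active: CPU 1 is still a candidate. */
    assert(toy_first_cpu(toy_lowest_mask(0xF, 0x6, 0xF)) == 1);
    /* CPU 1 going offline (bit 1 cleared): only CPU 2 remains. */
    assert(toy_first_cpu(toy_lowest_mask(0xF, 0x6, 0xD)) == 2);
    /* CPUs 1 and 2 both inactive: no candidate at this priority. */
    assert(toy_first_cpu(toy_lowest_mask(0xF, 0x6, 0x9)) == -1);
}
```

In this toy setup, taking CPU 1 out of the active mask shrinks the
candidate set from CPUs 1 and 2 to CPU 2 alone. The empty result in the
last case is presumably what the "at least one bit" comment in the hunk
guards against, though the sketch does not model that check.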