Date: Thu, 16 Apr 2020 11:39:05 +0800
From: Chen Yu
To: vpillai
Cc: Nishanth Aravamudan, Julien Desfossez, Peter Zijlstra, Tim Chen,
    mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
    torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
    fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com,
    Phil Auld, Aaron Lu, Aubrey Li, aubrey.li@linux.intel.com,
    Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini,
    Joel Fernandes, joel@joelfernandes.org, Aaron Lu, Long Cui
Subject: Re: [RFC PATCH 07/13] sched: Add core wide task selection and scheduling.
Message-ID: <20200416033804.GA5712@HP-G1>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 04, 2020 at 04:59:57PM +0000, vpillai wrote:
> From: Peter Zijlstra
>
> Instead of only selecting a local task, select a task for all SMT
> siblings for every reschedule on the core (irrespective which logical
> CPU does the reschedule).
>
> There could be races in core scheduler where a CPU is trying to pick
> a task for its sibling in core scheduler, when that CPU has just been
> offlined. We should not schedule any tasks on the CPU in this case.
> Return an idle task in pick_next_task for this situation.
>
> NOTE: there is still potential for siblings rivalry.
> NOTE: this is far too complicated; but thus far I've failed to
> simplify it further.
>
> Signed-off-by: Peter Zijlstra (Intel)
> Signed-off-by: Julien Desfossez
> Signed-off-by: Vineeth Remanan Pillai
> Signed-off-by: Aaron Lu
> Signed-off-by: Tim Chen
> ---
[cut]

Hi Vineeth,

A NULL pointer dereference was found when testing v5 on top of stable
v5.6.2. We then tried the patch Peter suggested, and the NULL pointer
dereference has not been reproduced so far. We don't know for certain
whether this change mitigates the symptom, but it should do no harm to
test with this fix applied.

Thanks,
Chenyu

From 6828eaf4611eeb3e1bad3b9a0d4ec53c6fa01fe3 Mon Sep 17 00:00:00 2001
From: Chen Yu
Date: Thu, 16 Apr 2020 10:51:07 +0800
Subject: [PATCH] sched: Fix pick_next_task() race condition in core scheduling

As Peter mentioned, commit 6e2df0581f56 ("sched: Fix pick_next_task()
vs 'change' pattern race") fixed a race condition caused by rq->lock
being improperly released after put_prev_task(); backport this fix to
core scheduling's pick_next_task() as well.
Without this fix, Aubrey, Long and I hit a NULL pointer dereference
within one hour when running RDT MBA (Intel Resource Director
Technology, Memory Bandwidth Allocation) benchmarks on a 36-core
(72-HT) platform; the kernel tries to dereference a NULL sched_entity:

[ 3618.429053] BUG: kernel NULL pointer dereference, address: 0000000000000160
[ 3618.429039] RIP: 0010:pick_task_fair+0x2e/0xa0
[ 3618.429042] RSP: 0018:ffffc90000317da8 EFLAGS: 00010046
[ 3618.429044] RAX: 0000000000000000 RBX: ffff88afdf4ad100 RCX: 0000000000000001
[ 3618.429045] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88afdf4ad100
[ 3618.429045] RBP: ffffc90000317dc0 R08: 0000000000000048 R09: 0100000000100000
[ 3618.429046] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 3618.429047] R13: 000000000002d080 R14: ffff88afdf4ad080 R15: 0000000000000014
[ 3618.429048]  ? pick_task_fair+0x48/0xa0
[ 3618.429048]  pick_next_task+0x34c/0x7e0
[ 3618.429049]  ? tick_program_event+0x44/0x70
[ 3618.429049]  __schedule+0xee/0x5d0
[ 3618.429050]  schedule_idle+0x2c/0x40
[ 3618.429051]  do_idle+0x175/0x280
[ 3618.429051]  cpu_startup_entry+0x1d/0x30
[ 3618.429052]  start_secondary+0x169/0x1c0
[ 3618.429052]  secondary_startup_64+0xa4/0xb0

With this patch applied, no NULL pointer dereference has been seen in
14 hours of testing so far. Although there is no direct evidence that
this fix resolves the issue, it does fix a potential race condition.
Signed-off-by: Chen Yu
---
 kernel/sched/core.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 02495d44870f..ef101a3ef583 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4477,9 +4477,14 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 		return next;
 	}
 
-	prev->sched_class->put_prev_task(rq, prev);
-	if (!rq->nr_running)
-		newidle_balance(rq, rf);
+
+#ifdef CONFIG_SMP
+	for_class_range(class, prev->sched_class, &idle_sched_class) {
+		if (class->balance(rq, prev, rf))
+			break;
+	}
+#endif
+	put_prev_task(rq, prev);
 
 	smt_mask = cpu_smt_mask(cpu);
-- 
2.20.1