Date: Mon, 24 Aug 2020 19:08:20 +0530
From: Srikar Dronamraju
To: Xunlei Pang
Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Juri Lelli, Wetp Zhang,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched/fair: Fix wrong cpu selecting from isolated domain
Message-ID: <20200824133820.GA31355@linux.vnet.ibm.com>
Reply-To: Srikar Dronamraju
References: <1598272219-43040-1-git-send-email-xlpang@linux.alibaba.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <1598272219-43040-1-git-send-email-xlpang@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Xunlei Pang [2020-08-24 20:30:19]:

> We've met problems that occasionally tasks with full cpumask
> (e.g. by putting it into a cpuset or setting to full affinity)
> were migrated to our isolated cpus in production environment.
>
> After some analysis, we found that it is due to the current
> select_idle_smt() not considering the sched_domain mask.
>
> Fix it by checking the valid domain mask in select_idle_smt().
>
> Fixes: 10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings())
> Reported-by: Wetp Zhang
> Signed-off-by: Xunlei Pang
> ---
>  kernel/sched/fair.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1a68a05..fa942c4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6075,7 +6075,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>  /*
>   * Scan the local SMT mask for idle CPUs.
>   */
> -static int select_idle_smt(struct task_struct *p, int target)
> +static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
>  {
>  	int cpu;
>
> @@ -6083,7 +6083,8 @@ static int select_idle_smt(struct task_struct *p, int target)
>  		return -1;
>
>  	for_each_cpu(cpu, cpu_smt_mask(target)) {
> -		if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> +		if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
> +		    !cpumask_test_cpu(cpu, sched_domain_span(sd)))
>  			continue;

I don't think this is the right thing to do. What if this task had set a
cpumask that doesn't cover all the CPUs in this sched_domain_span(sd)?

cpu_smt_mask(target) would already be limited to sched_domain_span(sd), so I
am not sure how this can help.

-- 
Thanks and Regards
Srikar Dronamraju
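
The question above comes down to a set relationship: if every CPU in the SMT mask is already inside the domain span, the added test can never reject anything; it only matters if some SMT sibling lies outside the span. The following is a toy model in plain C, not kernel code. The names toy_cpumask_t, toy_mask_test_cpu(), toy_mask_subset() and toy_pick_smt_sibling() and all mask values are hypothetical, and the idle check is deliberately left out. Whether the second case can actually occur in the kernel is exactly what this thread is asking, and the sketch does not settle it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU mask (hypothetical): bit n set means CPU n is in the set. */
typedef uint64_t toy_cpumask_t;

/* True if 'cpu' is a member of 'mask'. */
static bool toy_mask_test_cpu(int cpu, toy_cpumask_t mask)
{
	return cpu >= 0 && cpu < 64 && ((mask >> cpu) & 1u);
}

/* True if every CPU in 'a' is also in 'b'. */
static bool toy_mask_subset(toy_cpumask_t a, toy_cpumask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Return the lowest-numbered CPU that is in smt_mask, in the task's
 * allowed mask, and in domain_span, or -1 if there is none.
 * Mirrors the shape of the patched loop only; idleness is ignored.
 */
static int toy_pick_smt_sibling(toy_cpumask_t smt_mask,
				toy_cpumask_t allowed,
				toy_cpumask_t domain_span)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (!toy_mask_test_cpu(cpu, smt_mask))
			continue;
		if (!toy_mask_test_cpu(cpu, allowed) ||
		    !toy_mask_test_cpu(cpu, domain_span))
			continue;
		return cpu;
	}
	return -1;
}
```

With smt_mask = 0xC (CPUs 2 and 3) and domain_span = 0xF, the SMT mask is a subset of the span and the span test changes nothing. With domain_span = 0x7 (CPU 3 excluded) and allowed = 0x8, the span test is what turns the result from CPU 3 into -1.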