X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
References: <20220825122726.20819-1-vincent.guittot@linaro.org> <20220825122726.20819-2-vincent.guittot@linaro.org>
In-Reply-To: <20220825122726.20819-2-vincent.guittot@linaro.org>
From: Josh Don
Date: Mon, 12 Feb 2024 12:28:55 -0800
Subject: Re: [PATCH 1/4]
 sched/fair: make sure to try to detach at least one movable task
To: Vincent Guittot
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 linux-kernel@vger.kernel.org, zhangqiao22@huawei.com
Content-Type: text/plain; charset="UTF-8"

Hi Vincent,

On Thu, Aug 25, 2022 at 5:27 AM Vincent Guittot wrote:
>
> During load balance, we try at most env->loop_max time to move a task.
> But it can happen that the loop_max LRU tasks (ie tail of
> the cfs_tasks list) can't be moved to dst_cpu because of affinity.
> In this case, loop in the list until we found at least one.

We had a user recently trigger a hard lockup which we believe is due
to this patch. The user in question had O(10k) threads affinitized to
a cpu; seems like the process had an out of control thread spawning
issue, and was in the middle of getting killed. However, that was
being slowed down due to the fact that load balance was iterating all
these threads and bouncing the rq lock (and making no progress due to
ALL_PINNED). Before this patch, load balance would quit after hitting
loop_max.

Even ignoring that specific instance, it seems pretty easy for this
patch to cause a softlockup due to a buggy or malicious process. For
the tradeoff you were trying to make in this patch (spend more time
searching in the hopes that there's something migratable further in
the list), perhaps it would be better to adjust
sysctl.sched_nr_migrate instead of baking this into the kernel?

Best,
Josh

>
> The maximum of detached tasks remained the same as before.
>
> Signed-off-by: Vincent Guittot
> ---
>  kernel/sched/fair.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index da388657d5ac..02b7b808e186 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8052,8 +8052,12 @@ static int detach_tasks(struct lb_env *env)
>                 p = list_last_entry(tasks, struct task_struct, se.group_node);
>
>                 env->loop++;
> -               /* We've more or less seen every task there is, call it quits */
> -               if (env->loop > env->loop_max)
> +               /*
> +                * We've more or less seen every task there is, call it quits
> +                * unless we haven't found any movable task yet.
> +                */
> +               if (env->loop > env->loop_max &&
> +                   !(env->flags & LBF_ALL_PINNED))
>                         break;
>
>                 /* take a breather every nr_migrate tasks */
> @@ -10182,7 +10186,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>
>                 if (env.flags & LBF_NEED_BREAK) {
>                         env.flags &= ~LBF_NEED_BREAK;
> -                       goto more_balance;
> +                       /* Stop if we tried all running tasks */
> +                       if (env.loop < busiest->nr_running)
> +                               goto more_balance;
>                 }
>
>                 /*
> --
> 2.17.1
>
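
For anyone following the thread, the difference between the two exit tests in detach_tasks() can be sketched in userspace C. This is a simplified model and not kernel code. The names scan_tasks, struct scan_result, and can_move are hypothetical, each task is reduced to a single "allowed on dst_cpu" flag, and the scan stops at the first movable task, whereas the real function may go on to detach more. The rq lock and the LBF_NEED_BREAK breather are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only, NOT kernel code. All names are hypothetical.
 * A task is represented by one flag: can it run on dst_cpu?
 */
struct scan_result {
    int visited;   /* tasks examined before the scan stopped */
    int movable;   /* index of the first movable task, or -1 */
};

/*
 * keep_going_if_all_pinned == false models the pre-patch exit test:
 *     loop > loop_max
 * keep_going_if_all_pinned == true models the patched exit test:
 *     loop > loop_max && !(flags & LBF_ALL_PINNED)
 */
static struct scan_result scan_tasks(const bool *can_move, int nr_tasks,
                                     int loop_max,
                                     bool keep_going_if_all_pinned)
{
    struct scan_result res = { .visited = 0, .movable = -1 };
    bool all_pinned = true;   /* stands in for LBF_ALL_PINNED */
    int loop = 0;

    assert(nr_tasks >= 0 && loop_max >= 0);

    for (int i = 0; i < nr_tasks; i++) {
        loop++;
        /* Past loop_max: quit, unless the patched rule keeps us going. */
        if (loop > loop_max && !(keep_going_if_all_pinned && all_pinned))
            break;

        res.visited++;
        if (can_move[i]) {
            all_pinned = false;
            res.movable = i;
            break;   /* simplification: stop at the first movable task */
        }
    }
    return res;
}
```

Under these assumptions, with 10,000 pinned tasks and a loop_max of 32 (an arbitrary value chosen for illustration), the pre-patch test stops after 32 tasks while the patched test walks all 10,000 without finding anything to move, which is the shape of the scenario described above. If one movable task sits at index 40, the patched test finds it after 41 visits and the pre-patch test gives up before reaching it, which is the case the patch was written for.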