From: Vincent Guittot
Date: Wed, 4 Mar 2020 18:51:42 +0100
Subject: Re: 5.6-rc3: WARNING: CPU: 48 PID: 17435 at kernel/sched/fair.c:380 enqueue_task_fair+0x328/0x440
To: Christian Borntraeger
Cc: Ingo Molnar, Peter Zijlstra, "linux-kernel@vger.kernel.org"
References: <1a607a98-f12a-77bd-2062-c3e599614331@de.ibm.com> <20200228163545.GA18662@vingu-book> <49a2ebb7-c80b-9e2b-4482-7f9ff938417d@de.ibm.com> <2108173c-beaa-6b84-1bc3-8f575fb95954@de.ibm.com>
In-Reply-To: <2108173c-beaa-6b84-1bc3-8f575fb95954@de.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 4 Mar 2020 at 18:42, Christian Borntraeger wrote:
>
> On 04.03.20 16:26, Vincent Guittot wrote:
> > On Tue, 3 Mar 2020 at 08:55, Vincent Guittot wrote:
> >>
> >> On Tue, 3 Mar 2020 at 08:37, Christian Borntraeger wrote:
> >>>
> > [...]
> >>>>>> ---
> >>>>>>  kernel/sched/fair.c | 2 +-
> >>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>>>>> index 3c8a379c357e..beb773c23e7d 100644
> >>>>>> --- a/kernel/sched/fair.c
> >>>>>> +++ b/kernel/sched/fair.c
> >>>>>> @@ -4035,8 +4035,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> >>>>>>  		__enqueue_entity(cfs_rq, se);
> >>>>>>  	se->on_rq = 1;
> >>>>>>
> >>>>>> +	list_add_leaf_cfs_rq(cfs_rq);
> >>>>>>  	if (cfs_rq->nr_running == 1) {
> >>>>>> -		list_add_leaf_cfs_rq(cfs_rq);
> >>>>>>  		check_enqueue_throttle(cfs_rq);
> >>>>>>  	}
> >>>>>>  }
> >>>>>
> >>>>> Now running for 3 hours. I have not seen the issue yet. I can tell tomorrow if this fixes
> >>>>> the issue.
> >>>>
> >>>> Still running fine. I can tell for sure tomorrow, but I have the impression that this makes the
> >>>> WARN_ON go away.
> >>>
> >>> So I guess this change "fixed" the issue. If you want me to test additional patches, let me know.
> >>
> >> Thanks for the test. For now, I don't have any other patch to test. I
> >> have to look more deeply at how the situation happens.
> >> I will let you know if I have another patch to test.
> >
> > So I haven't been able to figure out how we reach this situation yet.
> > In the meantime I'm going to make a clean patch with the fix above.
> >
> > Is it ok if I add a Reported-by and a Tested-by from you?
>
> Sure.
> I just realized that this system has something special. Some months ago I created 2 slices:
> $ head /etc/systemd/system/*.slice
> ==> /etc/systemd/system/machine-production.slice <==
> [Unit]
> Description=VM production
> Before=slices.target
> Wants=machine.slice
> [Slice]
> CPUQuota=2000%
> CPUWeight=1000
>
> ==> /etc/systemd/system/machine-test.slice <==
> [Unit]
> Description=VM production
> Before=slices.target
> Wants=machine.slice
> [Slice]
> CPUQuota=300%
> CPUWeight=100
>
>
> And the guests are then put into these slices.
> That also means that this test will never use more than 2300%,
> no matter how many CPUs the system has.

Thanks for the information. I will try to see how this could impact the enqueue
>
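
For reference, the quota arithmetic behind the two slices quoted above can be sketched as follows. This is only an illustration, not kernel or systemd code: the function names are made up for this note, and the 100 ms period is an assumption (it is the usual default for CFS bandwidth control, but the period actually used on the reporting system is not stated in the thread).

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: default CFS bandwidth period of 100 ms, in microseconds.
 * The period configured on the reporting system is not given in the thread. */
#define ASSUMED_PERIOD_US 100000ULL

/* Hypothetical helper: convert a CPUQuota= percentage into the runtime (us)
 * a group may consume per period. 100% means one CPU busy for the whole period. */
uint64_t quota_us_from_percent(uint64_t percent, uint64_t period_us)
{
	assert(period_us > 0);
	return percent * period_us / 100;
}

/* Hypothetical helper: number of fully busy CPUs a quota percentage allows. */
uint64_t cpus_from_percent(uint64_t percent)
{
	return percent / 100;
}
```

Under that assumption, CPUQuota=2000% allows 2000000 us of runtime per 100000 us period and CPUQuota=300% allows 300000 us, so the two slices together are capped at 2300%, i.e. 23 CPUs' worth of time. Groups with a quota are the ones that can be throttled, which is what check_enqueue_throttle() in the diff above deals with.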