From: Odin Ugedal
Date: Wed, 23 Jun 2021 19:37:23 +0200
Subject: Re: [powerpc][next-20210621] WARNING at kernel/sched/fair.c:3277 during boot
To: Vincent Guittot
Cc: Sachin Sant, Odin Ugedal, Linux Next Mailing List, linuxppc-dev@lists.ozlabs.org, open list
References: <2ED1BDF5-BC0C-47CD-8F33-9A46C738F8CF@linux.vnet.ibm.com>
 <20210622143154.GA804@vingu-book>
 <53968DDE-9E93-4CB4-B5E4-526230B6E154@linux.vnet.ibm.com>
 <20210623071935.GA29143@vingu-book>
 <6C676AB3-5D06-471A-8715-60AABEBBE392@linux.vnet.ibm.com>
 <20210623120835.GB29143@vingu-book>
 <5D874F72-B575-4830-91C3-8814A2B371CD@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 23 Jun 2021 at 19:27, Vincent Guittot wrote:
>
> On Wed, 23 Jun 2021 at 18:55, Vincent Guittot wrote:
> >
> > On Wed, 23 Jun 2021 at 18:46, Sachin Sant wrote:
> > >
> > >
> > > > Ok. This becomes even more weird. Could you share your config file and more details about
> > > > your setup?
> > > >
> > > > Have you applied the patch below?
> > > > https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@linaro.org/
> > > >
> > > > Regarding the load_avg warning, I can see a possible problem during attach. Could you add
> > > > the patch below.
> > > > The load_avg warning seems to happen during boot and sched_entity
> > > > creation.
> > > >
> > >
> > > Here is a summary of my testing.
> > >
> > > I have a POWER box with the PowerVM hypervisor. On this box I have a logical partition (LPAR), or guest
> > > (allocated 32 CPUs and 90G of memory), running linux-next.
> > >
> > > I started with a clean slate.
> > > Moved to linux-next 5.13.0-rc7-next-20210622 as base code.
> > > Applied patch #1 from Vincent which contains changes to dequeue_load_avg()
> > > Applied patch #2 from Vincent which contains changes to enqueue_load_avg()
> > > Applied patch #3 from Vincent which contains changes to attach_entity_load_avg()
> > > Applied patch #4 from https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@linaro.org/
> > >
> > > With these changes applied I was still able to recreate the issue. I could see the kernel warning
> > > during boot.
> > >
> > > I then applied patch #5 from Odin which contains changes to update_cfs_rq_load_avg()
> > >
> > > With all 5 patches applied I was able to boot the kernel without any warning messages.
> > > I also ran the scheduler-related tests from ltp (./runltp -f sched). All tests, including cfs_bandwidth01,
> > > ran successfully. No kernel warnings were observed.
> >
> > OK, so Odin's patch fixes the problem, which highlights that we
> > overestimate _sum or don't sync _avg and _sum correctly
> >
> > I'm going to look at this further
>
> The problem is that "_avg * divider" assumes that all pending
> contributions are non-null, whereas they can be null.

Yeah.

> Odin's patch is the right way to fix this. Other patches should not be
> useful for your problem

Ack. As I see it, given how PELT works now, it is the only way to
mitigate it (without doing a lot of extra PELT stuff).

Will post it as a patch together with a proper message later today or tomorrow.

>
> >
> > >
> > > Have also attached .config in case it is useful.
> > > config has CONFIG_HZ_100=y
> >
> > Thanks, I will have a look
> >
> > >
> > > Thanks
> > > -Sachin
> >
>

Thanks for reporting Sachin!

Thanks
Odin
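
For readers following the "_avg * divider" remark above, here is a rough toy model of how an _avg and its _sum can drift apart when a removed contribution is subtracted from both. It is only a sketch under stated assumptions: the helper names (pelt_divider, remove_by_subtraction, remove_by_resync) are hypothetical, the numbers are made up for illustration, the divider formula and the 47742 constant are recalled from the kernel's PELT code rather than quoted from this thread, and re-deriving _sum from _avg is shown as one possible way to keep the two in sync, not as the content of any patch mentioned here.

```python
# Toy model of PELT bookkeeping; NOT kernel code.
LOAD_AVG_MAX = 47742  # assumed PELT constant (maximum possible sum)


def pelt_divider(period_contrib):
    """Hypothetical helper: divider relating a PELT _sum to its _avg."""
    return LOAD_AVG_MAX - 1024 + period_contrib


def sub_positive(value, delta):
    """Subtract without going below zero (Python stand-in for the idea)."""
    return max(value - delta, 0)


def remove_by_subtraction(avg, total, removed_avg, divider):
    """Subtract the removed average from _avg and removed_avg * divider
    from _sum. This assumes every pending contribution behind removed_avg
    was non-null, so it can take too much out of _sum."""
    new_avg = sub_positive(avg, removed_avg)
    new_total = sub_positive(total, removed_avg * divider)
    return new_avg, new_total


def remove_by_resync(avg, total, removed_avg, divider):
    """Subtract from _avg only, then rebuild _sum from the new _avg so the
    two values cannot disagree about being zero."""
    new_avg = sub_positive(avg, removed_avg)
    new_total = new_avg * divider
    return new_avg, new_total


if __name__ == "__main__":
    div = pelt_divider(282)
    # Illustrative state: _sum is lower than _avg * divider.
    print(remove_by_subtraction(10, 400_000, 9, div))
    print(remove_by_resync(10, 400_000, 9, div))
```

In this made-up state the subtraction variant leaves a non-zero _avg next to a zero _sum, which is the kind of _avg/_sum mismatch discussed in the thread, while the resync variant keeps both non-zero.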