From: Vincent Guittot
Date: Wed, 23 Jun 2021 19:27:07 +0200
Subject: Re: [powerpc][next-20210621] WARNING at kernel/sched/fair.c:3277 during boot
To: Sachin Sant
Cc: Odin Ugedal, Linux Next Mailing List, linuxppc-dev@lists.ozlabs.org, open list
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 23 Jun 2021 at 18:55, Vincent Guittot wrote:
>
> On Wed, 23 Jun 2021 at 18:46, Sachin Sant wrote:
> >
> >
> > > Ok. This becomes even more weird. Could you share your config file and more details about
> > > you setup ?
> > >
> > > Have you applied the patch below ?
> > > https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@linaro.org/
> > >
> > > Regarding the load_avg warning, I can see possible problem during attach. Could you add
> > > the patch below. The load_avg warning seems to happen during boot and sched_entity
> > > creation.
> > >
> >
> > Here is a summary of my testing.
> >
> > I have a POWER box with PowerVM hypervisor. On this box I have a logical partition(LPAR) or guest
> > (allocated with 32 cpus 90G memory) running linux-next.
> >
> > I started with a clean slate.
> > Moved to linux-next 5.13.0-rc7-next-20210622 as base code.
> > Applied patch #1 from Vincent which contains changes to dequeue_load_avg()
> > Applied patch #2 from Vincent which contains changes to enqueue_load_avg()
> > Applied patch #3 from Vincent which contains changes to attach_entity_load_avg()
> > Applied patch #4 from https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@linaro.org/
> >
> > With these changes applied I was still able to recreate the issue. I could see kernel warning
> > during boot.
> >
> > I then applied patch #5 from Odin which contains changes to update_cfs_rq_load_avg()
> >
> > With all the 5 patches applied I was able to boot the kernel without any warning messages.
> > I also ran scheduler related tests from ltp (./runltp -f sched) . All tests including cfs_bandwidth01
> > ran successfully. No kernel warnings were observed.
>
> ok so Odin's patch fixes the problem which highlights that we
> overestimate _sum or don't sync _avg and _sum correctly
>
> I'm going to look at this further

The problem is that "_avg * divider" assumes that all pending
contributions are non-null, whereas they can be null.

Odin's patch is the right way to fix this. The other patches should not
be useful for your problem.

> >
> > Have also attached .config in case it is useful. config has CONFIG_HZ_100=y
>
> Thanks, i will have a look
>
> >
> > Thanks
> > -Sachin
> >
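
To make the "_avg * divider" point concrete, here is a small toy model
in Python. It is only a sketch under stated assumptions, not the kernel
code and not Odin's actual diff: the helper names (pelt_divider,
remove_load), the constant value and the numbers are all illustrative.
It assumes the divider grows with the pending period contribution while
_sum stays unchanged when that pending contribution is null, so
"removed_avg * divider" can be larger than what _sum really holds.

```python
# Toy model only: names and numbers are illustrative assumptions,
# not taken from kernel/sched/fair.c.
LOAD_AVG_MAX = 47742  # assumed maximum of the decayed series


def pelt_divider(period_contrib):
    """Assumed divider: it grows with the pending part of the period."""
    return LOAD_AVG_MAX - 1024 + period_contrib


def sub_positive(value, delta):
    """Subtract but never go below zero."""
    return max(value - delta, 0)


def remove_load(load_avg, load_sum, removed_avg, period_contrib, resync):
    """Remove removed_avg from a (load_avg, load_sum) pair.

    resync=False: subtract removed_avg * divider from the sum.
    resync=True:  rebuild the sum from the new avg (one possible way to
                  keep both values consistent).
    """
    d = pelt_divider(period_contrib)
    new_avg = sub_positive(load_avg, removed_avg)
    if resync:
        new_sum = new_avg * d
    else:
        new_sum = sub_positive(load_sum, removed_avg * d)
    return new_avg, new_sum


def demo():
    # avg and sum were last in sync when period_contrib was 0.
    load_avg = 101
    load_sum = load_avg * pelt_divider(0)
    # Later period_contrib is 1000, but the pending contribution was
    # null, so load_sum did not grow.
    naive = remove_load(load_avg, load_sum, 100, 1000, resync=False)
    synced = remove_load(load_avg, load_sum, 100, 1000, resync=True)
    return naive, synced


if __name__ == "__main__":
    naive, synced = demo()
    print("naive :", naive)   # (1, 0): sum is 0 while avg is still 1
    print("synced:", synced)  # (1, 47718): sum and avg agree
```

In the naive case the sum is clamped to 0 while the avg is still 1,
which is the kind of _avg/_sum mismatch discussed above. Rebuilding the
sum from the avg keeps the pair consistent in this toy model.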