Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp807189ybc; Tue, 12 Nov 2019 09:28:39 -0800 (PST) X-Google-Smtp-Source: APXvYqzgDSRUTKtmhfcuYd11AYq1boK5TLaGjRFJZcGy2e5Rdnh4Q9xvZ+yrcyXvpbJwep9E6qt9 X-Received: by 2002:a17:906:52d3:: with SMTP id w19mr30043561ejn.268.1573579719392; Tue, 12 Nov 2019 09:28:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573579719; cv=none; d=google.com; s=arc-20160816; b=Chvk+P9ciwkyvlqOyZlKXeWTva7cyFGw9pAsnJvMPc0o3urHeJF0PbHCcBRJcH1zT5 Ct+cKc/Uf6y1lAjos1RozNpMXjqRW1ehcQx+69k4KtfPBJGchNwCcDKaZJXvrwqLSQzk OkmshRZWyBCLci4Ysz9GPOcwZ0SWe4D8Mx8ZKC+OU+9UJunjdcI3vmPCNjedRWJ0FFUi Z0FZz09rPq0caa6A5vVP1+LacHa2KIqU9oscWg7qje1hM5qgF4ll06IrIUXnVxhQGdwi 8jzVKEjWV9VoDzSaUnm8ct87Fnhlg5s+hu5rJDdFW/SBrAqYArjOM0gWh5bMKN6QjWEK jf0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=ezsIhe1GNm16S3puMGIvpD+Jx3sztUBZzOXQophYY/I=; b=iHLJpWit/YTwWNME/ATWteZIDLZFATHGhWIuvb1LVM0InhOpVZlfefR8uXnGdf9aIK VihuFCzIWFuRGjekVfpXHBj/C6MJ02e+baVEk58r9F+ytwAkH5jHZjCDrkG5jtsfzFQR GuSETNx2jHzZDdf9CdEjuqPsynD63YFPO0wDYtssVO10KEzJsG378QtBZpH5ZioXEUEJ D0aCmZDdneRyhm7497V9LTVnwijElvAF4pLAzRCsxqOQeD931LydE61+8EJ3wPICuE99 sB9Tok9VN1XgWF3Y+OF3HKzQZp1QmdTovbMQaPdT4xklZnmLeO6g1cBO28ExjX0yCrFP BL7Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=FEEmTpeN; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g28si14113302edg.327.2019.11.12.09.28.14; Tue, 12 Nov 2019 09:28:39 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=FEEmTpeN; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726980AbfKLR1Y (ORCPT + 99 others); Tue, 12 Nov 2019 12:27:24 -0500 Received: from mail-wm1-f68.google.com ([209.85.128.68]:52413 "EHLO mail-wm1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726738AbfKLR1X (ORCPT ); Tue, 12 Nov 2019 12:27:23 -0500 Received: by mail-wm1-f68.google.com with SMTP id l1so4144350wme.2 for ; Tue, 12 Nov 2019 09:27:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ezsIhe1GNm16S3puMGIvpD+Jx3sztUBZzOXQophYY/I=; b=FEEmTpeN5ZrY/aT3U30uEYPNrZUpIU5QYesr9P9pwYf1kIjxFGwV0paE4lMJopATyv GMqtHWLx/SEAyZ77ZTrXVAh+wQIRiGjWy6vBSD79bSsQR3JIexq2b47Xzr/40CgxOQf9 7OTTamegj8Trj3dx8oEco9asudnNA3GiCEQ0SPyGcrv/xIV0xj5CRr4S/oKusox7WASI C6RwsPK/ecuBbJALdvH8KhYdryyYhoYl5Tf2iMRmZbQ893rKH1ujCeuzT9vDrGHAG7A9 N/MKSSTMbQT4eseL8Bu9/8Xsuti8BgR7rWRg0+Fcjt75/RqQilxjA9Te1XskUKMBxFwi TJSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ezsIhe1GNm16S3puMGIvpD+Jx3sztUBZzOXQophYY/I=; b=ga7/pNCLGSf2gT0VIDPlriuwk8dsUzHvUhNENznMl25cW8m8iq/mDoWcasx0GsekMe iGzXmOB9lKRfjfzdNVDhKDMP2tzzgLelCZeSwAfj9YXHGImASjrwPkW/LcEbToViUIL4 4+h5oFZE8r48cjP1kxjmGgCK4RSZBws2Dr6DkD1M9SCnh9pysF5BVhY3lqXwJV9qa/9U KjuJkSlktZ5IX/nZV9lPWIwa47m04xXKdzhfeEoNHbsKIzy/VaR4msvu20yRqJPowQ18 WMgnEcDBuONbfbICRQdNmLvBR1fZ8TUB2VMjjA1XtbDBd4CzlywTFwGQzUg68heEO2HQ 9Vsw== X-Gm-Message-State: APjAAAWH431VoPvNUMyQju7XcofC98oQqsudCfuYmQ5ufK6gU2M1H0vG 6Rs0zNuHtpv3EoBTd6w2UOYW568z2LS4Y7BSsPX2bw== X-Received: by 2002:a1c:8055:: with SMTP id b82mr5217623wmd.176.1573579640763; Tue, 12 Nov 2019 09:27:20 -0800 (PST) MIME-Version: 1.0 References: <20191112154144.GC168812@cmpxchg.org> <20191112154844.GD168812@cmpxchg.org> <20191112160821.GE168812@cmpxchg.org> In-Reply-To: <20191112160821.GE168812@cmpxchg.org> From: Suren Baghdasaryan Date: Tue, 12 Nov 2019 09:27:09 -0800 Message-ID: Subject: Re: [PATCH] psi:fix divide by zero in psi_update_stats To: Johannes Weiner Cc: tim , Ingo Molnar , Peter Zijlstra , LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Nov 12, 2019 at 8:08 AM Johannes Weiner wrote: > > On Tue, Nov 12, 2019 at 10:48:46AM -0500, Johannes Weiner wrote: > > On Tue, Nov 12, 2019 at 10:41:46AM -0500, Johannes Weiner wrote: > > > On Fri, Nov 08, 2019 at 03:33:24PM +0800, tim wrote: > > > > In psi_update_stats, it is possible that period has value like > > > > 0xXXXXXXXX00000000 where the lower 32 bit is 0, then it calls div_u64 which > > > > truncates u64 period to u32, results in zero divisor. > > > > Use div64_u64() instead of div_u64() if the divisor is u64 to avoid > > > > truncation to 32-bit on 64-bit platforms. > > > > > > > > Signed-off-by: xiejingfeng > > > > > > This is legit. When we stop the periodic averaging worker due to an > > > idle CPU, the period after restart can be much longer than the ~4 sec > > > in the lower 32 bits. See the missed_periods logic in update_averages. > > > > Argh, that's not right. Of course I notice right after hitting send. > > > > missed_periods are subtracted out of the difference between now and > > the last update, so period should be not much bigger than 2s. > > > > Something else is going on here. > > Tim, does this happen right after boot? I wonder if it's because we're > not initializing avg_last_update, and the initial delta between the > last update (0) and the first scheduled update (sched_clock() + 2s) > ends up bigger than 4 seconds somehow. Later on, the delta between the > last and the scheduled update should always be ~2s. But for that to > happen, it would require a pretty slow boot, or a sched_clock() that > does not start at 0. > > Tim, if you have a coredump, can you extract the value of the other > variables printed in the following patch? > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 84af7aa158bf..1b6836d23091 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -374,6 +374,10 @@ static u64 update_averages(struct psi_group *group, u64 now) > */ > avg_next_update = expires + ((1 + missed_periods) * psi_period); > period = now - (group->avg_last_update + (missed_periods * psi_period)); > + > + WARN(period >> 32, "period=%ld now=%ld expires=%ld last=%ld missed=%ld\n", > + period, now, expires, group->avg_last_update, missed_periods); > + > group->avg_last_update = now; > > for (s = 0; s < NR_PSI_STATES - 1; s++) { > > And we may need something like this to make the tick initialization > more robust regardless of the reported bug here: > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 84af7aa158bf..ce8f6748678a 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -185,7 +185,8 @@ static void group_init(struct psi_group *group) > > for_each_possible_cpu(cpu) > seqcount_init(&per_cpu_ptr(group->pcpu, cpu)->seq); > - group->avg_next_update = sched_clock() + psi_period; > + group->avg_last_update = sched_clock(); > + group->avg_next_update = group->avg_last_update + psi_period; > INIT_DELAYED_WORK(&group->avgs_work, psi_avgs_work); > mutex_init(&group->avgs_lock); > /* Init trigger-related members */ Both fixes for group_init() and window_update() make sense to me. window_update() division would be reproducible because win->size is set during trigger setup and does not change afterwards. Since userspace defines the window size in usecs this would require doing some math and finding a value that yields zeros in 32 LSBs after conversion into nsecs (see: t->win.size = window_us * NSEC_PER_USEC in psi_trigger_create()). Haven't seen this issue because all my test cases (1, 2, 10secs) had non-zero LSBs.