Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp8827355ybc; Fri, 29 Nov 2019 17:46:08 -0800 (PST) X-Google-Smtp-Source: APXvYqw6p/f9eiPVUokQ7XUU5czV0+lzYuyI8iOVDIjnUBxsEQYphe+JavkkIqMzvqzyiKbs5L71 X-Received: by 2002:a17:906:b212:: with SMTP id p18mr41267257ejz.208.1575078368470; Fri, 29 Nov 2019 17:46:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1575078368; cv=none; d=google.com; s=arc-20160816; b=ekXlzA6/xwRAsohVd1im3rUCbvk7OpeJ7vlQ1DUtvFFqwd/nkQQ7C324+FpIxpivgL 9WEtz1hOIGfXR3Jl1O/1r4FpTeYOYmlsKLPv4mUlXJ1HNnLPB70y+Glv1R8rFKAzVvOD hPiYohWvk5UHN1MbsXKl5VhyKMb4aSzTfMnB1QLfTn05vUKItyrQw6le8/nevk10J2D2 KaduulfUaI4V84ozgiHady5XFr3guqGhnM8Rz8t12Lq3UcAnA9RAQ/y1Gu8OSED8F8lA vWh42EBQ+6lQCLFslmk4fQV4YfdkrW4HrFcwGNZYQcWdE+BW56CPzdGAaPkotsYKyZPu MLJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=vOLjKVo8ueA2YbXi3fZ4PuqiyQNm4BjfA1BtTT0sKxw=; b=WpPYRCxk8H1SnXga3RXQFhVdPP3qXCXUGz391XY5jj9LDOKyUF9+nv5Dx68oYCxqim LAkh4OfOw12wbvd+bXpn5+AT1stq8lm6YXAz+LFlK51ZR2RehSANdc2jIn0ZlI6MwSMr 9rx5tyrNlCADg+FFM4aohbaGKtUCdVO4oG7NvHsGR20HtAl1FyJQbi2/VycC1L5Sg6cH C+wRTBjRoQIOgtbIxG5x3xMD27b+o4s0s8Rep4P5gqefLl0ZBPdefSF6pROJcBpZp82Z xnHeEety4zw/pHRjLKLVzIHZLP2doNx5O/9cFicF4j/Ca4JSqBtKQtNmQlP2IgvwFwHx 6+Bg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=Brpl0pHW; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id cb5si10595782edb.252.2019.11.29.17.45.44; Fri, 29 Nov 2019 17:46:08 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=Brpl0pHW; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727330AbfK3BmK (ORCPT + 99 others); Fri, 29 Nov 2019 20:42:10 -0500 Received: from mail-wm1-f67.google.com ([209.85.128.67]:39576 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727142AbfK3BmJ (ORCPT ); Fri, 29 Nov 2019 20:42:09 -0500 Received: by mail-wm1-f67.google.com with SMTP id s14so10153075wmh.4 for ; Fri, 29 Nov 2019 17:42:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=vOLjKVo8ueA2YbXi3fZ4PuqiyQNm4BjfA1BtTT0sKxw=; b=Brpl0pHW0spKzBZZ9YRMJmC1tYZua+wog5q2An1I0VUbCrzAyaKLHQogB3AlwOHN8M dRvmqZySU+MaM6xoljs1rCemUpz/dxN8XcwqHhrlA+Yfms7/w9fP2Y+Cu2zi9sRb8wmh I9kO4vDDhseKBKFReWzZbNIKZPn1aWNciLL3cE6jHeD8btmErITAqqqBx/mlvRw+CQ7l +VFOIVg5MMcTOoRwuwkWgaNcjABeaB1X+snB2soTF4zk6W5QtFITqDGmoRsHv/Fpj55x +DkLAlAgjBwx+BdY/k/sNPZyEvgEnSoUe4JyVIavOhtKCMIZIs/o/HJyt0JxoXMjy0NT pIFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=vOLjKVo8ueA2YbXi3fZ4PuqiyQNm4BjfA1BtTT0sKxw=; b=WjAn0v6J2tgVDGZFoRi/AkqWKr3Z/8Kf/Kmm/Geaxr954W4RWCChWyBa1jR0wLQRZ0 ow1dvJG7VTC6E4NgfwhH5oBXd4Qo7F+75Ca3XBQ9v0XSmV4NskJghpyxLdmO9EZpH54x CJLGoqq96W7k0ckVLGtJPA5PQIgOC4J+jtfcWsOAUerm6Z7zX+MAoZfK/RHqyD+6+xUd xFFibYL8qkdhqbqTntblWZ5BpEvIaBYE0G4SuRewr7rOUvmCJkchur5O4G/NW+gciIbi CacbYu92tS7NF5dQiuMuh96rymr7wiYxVyysm5qD/glm7PaAAqSty65Po7nou8vGzY4g 4wgA== X-Gm-Message-State: APjAAAUkBqV+t7wydrbMM22ihk8nC5REK0DzAPLtfhZfZoqTgXuDaA1r vZiCRYsV6KYGM1ge97Z3cUpKE3Wejk+YuaIQM2FYSA== X-Received: by 2002:a7b:c303:: with SMTP id k3mr17656620wmj.102.1575078126014; Fri, 29 Nov 2019 17:42:06 -0800 (PST) MIME-Version: 1.0 References: <20191112154144.GC168812@cmpxchg.org> <20191112154844.GD168812@cmpxchg.org> <20191112160821.GE168812@cmpxchg.org> In-Reply-To: From: Suren Baghdasaryan Date: Fri, 29 Nov 2019 17:41:55 -0800 Message-ID: Subject: Re: [PATCH] psi:fix divide by zero in psi_update_stats To: Jingfeng Xie Cc: Johannes Weiner , Ingo Molnar , Peter Zijlstra , LKML , Xunlei Pang , =?UTF-8?B?6b2Q5rGfKOeqhem7mCk=?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 28, 2019 at 10:37 PM Jingfeng Xie wrote: > > Weiner, > The crash does not happen right after boot, in my case, it happens in 58= 914 ~ 815463 seconds range since boot > > With my coredump=EF=BC=8Csome values are extracted as below=EF=BC=9A > > period =3D 001df2dc00000000 > now =3D 001df2dc00000000=EF=BC=8C same as period > expires =3D group->next_update =3D rdi =3D 00003594f700648e > group->avg_last_update could not be known > missed_periods =3D 0 > Considering that "period =3D now - (group->avg_last_update + (missed_periods * psi_period))" and the above values (period=3D=3Dnow and missed_periods=3D=3D0), group->avg_last_update must be 0 and that would mean this is indeed the first update_averages() call. I think this can happen if a cgroup is created long after the boot. The following call chain would happen: cgroup_create->psi_cgroup_alloc->group_init->INIT_DELAYED_WORK->psi_avgs_wo= rk->update_averages. If this cgroup creation is timed so that psi_avgs_work is called when sched_clock returns a value with LSBs of 0 then we get this problem. The patch Johannes posted earlier which sets group->avg_last_update to sched_clock in group_init should have fixed this problem. Tim, did you capture this coredump after applying that patch? If not please try applying it and see if it still happens. > =EF=BB=BF=E5=9C=A8 2019/11/13 =E4=B8=8A=E5=8D=8812:08=EF=BC=8C=E2=80=9CJo= hannes Weiner=E2=80=9D =E5=86=99=E5=85=A5: > > On Tue, Nov 12, 2019 at 10:48:46AM -0500, Johannes Weiner wrote: > > On Tue, Nov 12, 2019 at 10:41:46AM -0500, Johannes Weiner wrote: > > > On Fri, Nov 08, 2019 at 03:33:24PM +0800, tim wrote: > > > > In psi_update_stats, it is possible that period has value like > > > > 0xXXXXXXXX00000000 where the lower 32 bit is 0, then it calls d= iv_u64 which > > > > truncates u64 period to u32, results in zero divisor. > > > > Use div64_u64() instead of div_u64() if the divisor is u64 to = avoid > > > > truncation to 32-bit on 64-bit platforms. > > > > > > > > Signed-off-by: xiejingfeng > > > > > > This is legit. When we stop the periodic averaging worker due to = an > > > idle CPU, the period after restart can be much longer than the ~4= sec > > > in the lower 32 bits. See the missed_periods logic in update_aver= ages. > > > > Argh, that's not right. Of course I notice right after hitting send= . > > > > missed_periods are subtracted out of the difference between now and > > the last update, so period should be not much bigger than 2s. > > > > Something else is going on here. > > Tim, does this happen right after boot? I wonder if it's because we'r= e > not initializing avg_last_update, and the initial delta between the > last update (0) and the first scheduled update (sched_clock() + 2s) > ends up bigger than 4 seconds somehow. Later on, the delta between th= e > last and the scheduled update should always be ~2s. But for that to > happen, it would require a pretty slow boot, or a sched_clock() that > does not start at 0. > > Tim, if you have a coredump, can you extract the value of the other > variables printed in the following patch? > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 84af7aa158bf..1b6836d23091 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -374,6 +374,10 @@ static u64 update_averages(struct psi_group *gro= up, u64 now) > */ > avg_next_update =3D expires + ((1 + missed_periods) * psi_period)= ; > period =3D now - (group->avg_last_update + (missed_periods * psi_= period)); > + > + WARN(period >> 32, "period=3D%ld now=3D%ld expires=3D%ld last=3D%= ld missed=3D%ld\n", > + period, now, expires, group->avg_last_update, missed_periods= ); > + > group->avg_last_update =3D now; > > for (s =3D 0; s < NR_PSI_STATES - 1; s++) { > > And we may need something like this to make the tick initialization > more robust regardless of the reported bug here: > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 84af7aa158bf..ce8f6748678a 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -185,7 +185,8 @@ static void group_init(struct psi_group *group) > > for_each_possible_cpu(cpu) > seqcount_init(&per_cpu_ptr(group->pcpu, cpu)->seq); > - group->avg_next_update =3D sched_clock() + psi_period; > + group->avg_last_update =3D sched_clock(); > + group->avg_next_update =3D group->avg_last_update + psi_period; > INIT_DELAYED_WORK(&group->avgs_work, psi_avgs_work); > mutex_init(&group->avgs_lock); > /* Init trigger-related members */ > > > >