Date: Thu, 18 Oct 2018 14:36:05 +0200
From: luca abeni
To: Juri Lelli
Cc: Thomas Gleixner, Juri Lelli, Peter Zijlstra, syzbot, Borislav Petkov, "H. Peter Anvin", LKML, mingo@redhat.com, nstange@suse.de, syzkaller-bugs@googlegroups.com, henrik@austad.us, Tommaso Cucinotta, Claudio Scordino, Daniel Bristot de Oliveira
Subject: Re: INFO: rcu detected stall in do_idle
Message-ID: <20181018143605.6ce5f208@luca64>
In-Reply-To: <20181018122142.GF21611@localhost.localdomain>
Organization: Scuola Superiore S. Anna

Hi Juri,

On Thu, 18 Oct 2018 14:21:42 +0200
Juri Lelli wrote:
[...]
> > > > I missed the original emails, but maybe the issue is that the
> > > > task blocks before the tick, and when it wakes up again
> > > > something goes wrong with the deadline and runtime assignment?
> > > > (maybe because the deadline is in the past?)
> > >
> > > No, the problem is that the task won't be throttled at all,
> > > because its replenishing instant is always way in the past when
> > > tick occurs. :-/
> >
> > Ok, I see the issue now: the problem is that the "while
> > (dl_se->runtime <= 0)" loop is executed at replenishment time, but
> > the deadline should be postponed at enforcement time.
> >
> > I mean: in update_curr_dl() we do:
> >     dl_se->runtime -= scaled_delta_exec;
> >     if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
> >         ...
> >         enqueue replenishment timer at dl_next_period(dl_se)
> > But dl_next_period() is based on a "wrong" deadline!
> >
> >
> > I think that inserting a
> >     while (dl_se->runtime <= -pi_se->dl_runtime) {
> >         dl_se->deadline += pi_se->dl_period;
> >         dl_se->runtime += pi_se->dl_runtime;
> >     }
> > immediately after "dl_se->runtime -= scaled_delta_exec;" would fix
> > the problem, no?
>
> Mmm, I also thought of letting the task "pay back" its overrunning.
> But, doesn't this get us quite far from what one would expect. I mean,
> enforcement granularity will be way different from task period, no?

Yes, the granularity will be whatever the kernel can provide (it depends
on the HZ value and on whether hrtick is on or off). But at least the
task will not starve non-deadline tasks (which is the bug that
originated this discussion, I think).
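
To make the enforcement-time loop quoted above concrete, here is a
minimal user-space sketch of the accounting step. It is not the kernel
code: struct dl_entity and charge_runtime() are hypothetical names made
up for illustration, the per-task parameters (pi_se->dl_runtime and
pi_se->dl_period in the quoted snippet) are folded into the same
struct, and the replenishment timer is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's deadline entity. */
struct dl_entity {
    int64_t  runtime;    /* remaining budget; may go negative on overrun */
    uint64_t deadline;   /* absolute scheduling deadline */
    int64_t  dl_runtime; /* budget granted per period (must be > 0) */
    uint64_t dl_period;  /* reservation period */
};

/*
 * Charge delta_exec to the entity and, if the overrun spans one or more
 * full budgets, postpone the deadline right away (at enforcement time)
 * instead of waiting for the replenishment code to do it.
 */
static void charge_runtime(struct dl_entity *se, int64_t delta_exec)
{
    /* A non-positive budget would make the loop below spin forever. */
    assert(se->dl_runtime > 0);

    se->runtime -= delta_exec;

    /* Each iteration "pays back" one full budget with one full period. */
    while (se->runtime <= -se->dl_runtime) {
        se->deadline += se->dl_period;
        se->runtime  += se->dl_runtime;
    }
}
```

For example, with dl_runtime = 10, dl_period = 100, runtime = 10 and
deadline = 100, charging 35 units in one go leaves runtime at -5 and
moves the deadline to 300, so a replenishment instant computed from
that deadline is no longer in the past.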

If I understand correctly, there are two different (and orthogonal)
issues here:
1) Due to a bug in the accounting / enforcement mechanisms (the wrong
   placement of the while() loop), the task consumes 100% of the CPU
   time, starving non-deadline tasks
2) Due to the large HZ value, the small runtime (and period) and the
   fact that hrtick is disabled, the kernel cannot provide the
   requested scheduling granularity

The second issue can be fixed by imposing limits on the minimum and
maximum runtime, and the first issue can be fixed by changing the code
as I suggested in my previous email.

I would suggest addressing both issues, with separate changes (the
current replenishment code looks strange anyway).


			Luca