Subject: Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
To: Valentin Schneider, Thomas Gleixner
Cc: Vladimir Oltean, Kurt Kanzenbach, Alison Wang, catalin.marinas@arm.com,
 will@kernel.org, paulmck@kernel.org, mw@semihalf.com, leoyang.li@nxp.com,
 vladimir.oltean@nxp.com, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Anna-Maria Gleixner, Peter Zijlstra
References: <20200729033934.22349-1-alison.wang@nxp.com> <877dumbtoi.fsf@kurt>
 <20200729094943.lsmhsqlnl7rlnl6f@skbuf> <87mu3ho48v.fsf@kurt>
 <20200730082228.r24zgdeiofvwxijm@skbuf> <873654m9zi.fsf@kurt>
 <87lfiwm2bj.fsf@nanos.tec.linutronix.de>
 <20200803114112.mrcuupz4ir5uqlp6@skbuf>
 <87d047n4oh.fsf@nanos.tec.linutronix.de>
 <875z9zmt4i.fsf@nanos.tec.linutronix.de>
From: Dietmar Eggemann
Message-ID:
 <02195130-3d9a-a206-d931-fab7dc606061@arm.com>
Date: Wed, 5 Aug 2020 10:50:29 +0200

On 04/08/2020 01:59, Valentin Schneider wrote:
>
> On 03/08/20 20:22, Thomas Gleixner wrote:
>> Valentin,
>>
>> Valentin Schneider writes:
>>> On 03/08/20 16:13, Thomas Gleixner wrote:
>>>> Vladimir Oltean writes:
>>>>>> 1) When irq accounting is disabled, RT throttling kicks in as
>>>>>>    expected.
>>>>>>
>>>>>> 2) With irq accounting the RT throttler does not kick in and the RCU
>>>>>>    stall/lockups happen.
>>>>> What is this telling us?
>>>>
>>>> It seems that the fine grained irq time accounting affects the runtime
>>>> accounting in some way which I haven't figured out yet.
>>>>
>>>
>>> With IRQ_TIME_ACCOUNTING, rq_clock_task() will always be incremented by a
>>> lesser-or-equal value than when not having the option; you start with the
>>> same delta_exec but slice some for the IRQ accounting, and leave the rest
>>> for the rq_clock_task() (+paravirt).
>>>
>>> IIUC this means that if you spend e.g. 10% of the time in IRQ and 90% of
>>> the time running the stress-ng RT tasks, despite having RT tasks hogging
>>> the entirety of the "available time" it is still only 90% runtime, which is
>>> below the 95% default and the throttling doesn't happen.
>>
>>    totaltime = irqtime + tasktime
>>
>> Ignoring irqtime and pretending that totaltime is what the scheduler
>> can control and deal with is naive at best.
>>
>
> Agreed, however AFAICT rt_time is only incremented by rq_clock_task()
> deltas, which don't include IRQ time with IRQ_TIME_ACCOUNTING=y. That would
> then be directly compared to the sysctl runtime.
>
> Adding some prints in sched_rt_runtime_exceeded() and running this test
> case on my Juno, I get:
>   # IRQ_TIME_ACCOUNTING=y
>   cpu=2 rt_time=713455220 runtime=950000000 rq->avg_irq.util_avg=265
>   (rt_time oscillates between [70.1e7, 75.1e7]; avg_irq between [220, 270])
>
>   # IRQ_TIME_ACCOUNTING=n
>   cpu=2 rt_time=963035300 runtime=949951811
>   (rt_time oscillates between [94.1e7, 96.1e7];
>
> Throttling happens for IRQ_TIME_ACCOUNTING=n and doesn't for
> IRQ_TIME_ACCOUNTING=y - clearly the accounted rt_time isn't high enough for
> that to happen, and it does look like what is missing in rt_time (or what
> should be subtracted from the available runtime) is there in the avg_irq.

I agree that w/ IRQ_TIME_ACCOUNTING=y rt_rq->rt_time isn't high enough in
this testcase.

stress-ng-hrtim-1655 [001] 462.897733: bprint: update_curr_rt:
  rt_rq->rt_time=416716900 rt_rq->rt_runtime=950000000
  rt_b->rt_runtime=950000000

The 5% reservation (1 - sched_rt_runtime_us/sched_rt_period_us) for CFS
is massively eclipsed by irqtime.

It's true that avg_irq tracks 'irq_delta + steal' time, but it is meant to
potentially reduce cpu capacity. It's also cpu and frequency invariant
(your CPU2 is a big CPU so no issue here).

Could a rq_clock(rq)-derived rt_rq signal be used to compare against
rt_runtime?

BTW, DL already influences rt_rq->rt_time.

[...]
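
The arithmetic behind the "90% is below the 95% default" argument quoted above can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: the function names `accounted_rt_time` and `throttled` are hypothetical, the period and runtime are the default 1 s and 0.95 s, the RT task is assumed to hog the CPU for a whole period, and IRQ load is given as an integer percentage of wall-clock time.

```python
# Toy model of the RT throttling comparison discussed in this thread.
# Hypothetical names; assumes default sched_rt_period_us / sched_rt_runtime_us.

PERIOD_NS = 1_000_000_000   # one RT period (1 s), in nanoseconds
RUNTIME_NS = 950_000_000    # allowed RT runtime per period (0.95 s)


def accounted_rt_time(irq_percent: int, irq_time_accounting: bool) -> int:
    """Return the rt_time charged in one period to a CPU-hogging RT task.

    With IRQ time accounting, the task clock excludes time spent in IRQ
    context, so only the non-IRQ share of the period is charged.
    Without it, IRQ time is charged to the running task as well.
    """
    if irq_time_accounting:
        return PERIOD_NS * (100 - irq_percent) // 100
    return PERIOD_NS


def throttled(rt_time_ns: int, runtime_ns: int = RUNTIME_NS) -> bool:
    """Throttle once the accounted rt_time exceeds the allowed runtime."""
    return rt_time_ns > runtime_ns


if __name__ == "__main__":
    for irq_percent in (3, 10, 25):
        for accounting in (False, True):
            rt_time = accounted_rt_time(irq_percent, accounting)
            print(f"irq={irq_percent}% accounting={accounting} "
                  f"rt_time={rt_time} throttled={throttled(rt_time)}")
```

In this model, any IRQ share above 5% keeps the accounted rt_time under the runtime limit when IRQ time accounting is enabled, so the throttle never triggers even though the RT task and IRQs together leave nothing for CFS.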