From: Tianxianting
To: "peterz@infradead.org", Jan Kara
CC: "viro@zeniv.linux.org.uk", "bcrl@kvack.org", "mingo@redhat.com", "juri.lelli@redhat.com", "vincent.guittot@linaro.org", "dietmar.eggemann@arm.com", "rostedt@goodmis.org", "bsegall@google.com", "mgorman@suse.de", "linux-fsdevel@vger.kernel.org", "linux-aio@kvack.org", "linux-kernel@vger.kernel.org", Tejun Heo, "hannes@cmpxchg.org"
Subject: RE: [PATCH] aio: make aio wait path to
 account iowait time
Date: Fri, 28 Aug 2020 11:57:18 +0000
References: <20200828060712.34983-1-tian.xianting@h3c.com> <20200828090729.GT1362448@hirez.programming.kicks-ass.net> <20200828094129.GF7072@quack2.suse.cz> <20200828105153.GV1362448@hirez.programming.kicks-ass.net>
In-Reply-To: <20200828105153.GV1362448@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Thanks peterz, jan

So enabling aio iowait time accounting is a bad idea :(
I learned a lot from you, thanks.

-----Original Message-----
From: peterz@infradead.org [mailto:peterz@infradead.org]
Sent: Friday, August 28, 2020 6:52 PM
To: Jan Kara
Cc: tianxianting (RD); viro@zeniv.linux.org.uk; bcrl@kvack.org; mingo@redhat.com; juri.lelli@redhat.com; vincent.guittot@linaro.org; dietmar.eggemann@arm.com; rostedt@goodmis.org; bsegall@google.com; mgorman@suse.de; linux-fsdevel@vger.kernel.org; linux-aio@kvack.org; linux-kernel@vger.kernel.org; Tejun Heo; hannes@cmpxchg.org
Subject: Re: [PATCH] aio: make aio wait path to account iowait time

On Fri, Aug 28, 2020 at 11:41:29AM +0200, Jan Kara wrote:
> On Fri 28-08-20 11:07:29, peterz@infradead.org wrote:
> > On Fri, Aug 28, 2020 at 02:07:12PM +0800, Xianting Tian wrote:
> > > As the normal aio wait path (read_events() ->
> > > wait_event_interruptible_hrtimeout()) doesn't account iowait time,
> > > so use this patch to make it to account iowait time, which can
> > > truly reflect the system io situation when using a tool like 'top'.
> > >
> > Do be aware though that io_schedule() is potentially far more
> > expensive than regular schedule() and io-wait accounting as a whole
> > is a trainwreck.
>
> Hum, I didn't know that io_schedule() is that much more expensive.
> Thanks for info.

It's all relative, but it can add up under contention. And since these
storage thingies are getting faster every year, I'm assuming these
schedule rates are increasing along with it.

> > When in_iowait is set schedule() and ttwu() will have to do
> > additional atomic ops, and (much) worse, PSI will take additional locks.
> >
> > And all that for a number that, IMO, is mostly useless, see the
> > comment with nr_iowait().
>
> Well, I understand the limited usefulness of the system or even per
> CPU percentage spent in IO wait. However whether a particular task is
> sleeping waiting for IO or not

So strict per-task state is not a problem, and we could easily change
get_task_state() to distinguish between IO-wait or not, basically
duplicate S/D state into an IO-wait variant of the same. Although even
this has ABI implications :-(

> is IMO a useful diagnostic information and there are several places in
> the kernel that take that into account (PSI, hangcheck timer, cpufreq,
> ...).

So PSI is the one I hate most. We spend an awful lot of time to not
have to take the old rq->lock on wakeup, and PSI reintroduced it for
accounting purposes -- I hate accounting overhead. :/

There's a number of high frequency scheduling workloads where it really
adds up, which is the reason we got rid of it in the first place.

OTOH, PSI gives more sensible numbers, although it goes sideways when
you introduce affinity masks / cpusets.

The menu-cpufreq gov is known crazy and we're all working hard on
replacing it.

And the tick-sched usage is, iirc, the nohz case of iowait.

> So I don't see that properly accounting that a task is waiting for IO
> is just "expensive random number generator" as you mention below :).
> But I'm open to being educated...

It's the userspace iowait, and in particular the per-cpu iowait numbers
that I hate. Only on UP does any of that make sense.

But we can't remove them because ABI :-(
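
For readers following the thread from userspace: the per-cpu iowait numbers mentioned above are what tools like 'top' derive from the "cpu" lines of /proc/stat. Below is a minimal sketch of that derivation, assuming the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal). The function names and the two sample snapshots are made up for illustration; this is not a copy of how top itself is implemented.

```python
from typing import Dict, Tuple

# Field order of a "cpu" line in /proc/stat, per proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_lines(stat_text: str) -> Dict[str, Tuple[int, ...]]:
    """Map 'cpu', 'cpu0', ... to their first eight tick counters."""
    counters = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("cpu"):
            # guest/guest_nice (fields 9 and 10) are already included in
            # user/nice, so they are dropped to avoid double counting.
            counters[parts[0]] = tuple(int(v) for v in parts[1:1 + len(FIELDS)])
    return counters


def iowait_percent(before: str, after: str) -> Dict[str, float]:
    """Percentage of elapsed ticks each cpu line attributes to iowait."""
    old, new = parse_cpu_lines(before), parse_cpu_lines(after)
    result = {}
    for cpu, now in new.items():
        if cpu not in old:
            continue
        delta = [n - o for n, o in zip(now, old[cpu])]
        total = sum(delta)
        iowait = delta[FIELDS.index("iowait")]
        result[cpu] = 100.0 * iowait / total if total > 0 else 0.0
    return result


# Hypothetical snapshots (invented values, not real measurements).
SAMPLE_BEFORE = """cpu  100 0 50 800 50 0 0 0 0 0
cpu0 50 0 25 400 25 0 0 0 0 0
cpu1 50 0 25 400 25 0 0 0 0 0
intr 12345
"""
SAMPLE_AFTER = """cpu  150 0 70 1100 80 0 0 0 0 0
cpu0 80 0 35 550 35 0 0 0 0 0
cpu1 70 0 35 550 45 0 0 0 0 0
intr 12400
"""

if __name__ == "__main__":
    for cpu, pct in iowait_percent(SAMPLE_BEFORE, SAMPLE_AFTER).items():
        print(f"{cpu}: {pct:.1f}% iowait")
```

On a live system the two snapshots would be two reads of /proc/stat taken some interval apart. As the original patch description notes, a task sleeping in read_events() does not add to the iowait column today, so on an otherwise idle CPU that time would presumably land in the idle column of this calculation instead.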