Date: Mon, 15 Jan 2024 19:38:07 +0800
From: Ming Lei
To: Yu Kuai
Cc: hch@lst.de, bvanassche@acm.org, axboe@kernel.dk, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: Re: [PATCH for-6.8/block] block: support to account io_ticks precisely
References: <20240109071332.2216253-1-yukuai1@huaweicloud.com>
In-Reply-To: <20240109071332.2216253-1-yukuai1@huaweicloud.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 09, 2024 at 03:13:32PM +0800, Yu Kuai wrote:
> From: Yu Kuai
>
> Currently, io_ticks is accounted based on sampling, specifically
> update_io_ticks() will always account io_ticks by 1 jiffies from
> bdev_start_io_acct()/blk_account_io_start(), and the result can be
> inaccurate, for example(HZ is 250):
>
> Test script:
> fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms
>
> Test result: util is about 90%, while the disk is really idle.

Just curious, what is the result with this patch? 0%?

>
> In order to account io_ticks precisely, update_io_ticks() must know if
> there are IO inflight already, and this requires overhead slightly,
> hence precise io accounting is disabled by default, and user can enable
> it through sysfs entry.
>
> Noted that for rq-based device, part_stat_local_inc/dec() and
> part_in_flight() is used to track inflight instead of iterating tags,
> which is not supposed to be used in fast path because 'tags->lock' is
> grabbed in blk_mq_find_and_get_req().
>
> Signed-off-by: Yu Kuai
> ---
> Changes from RFC v1:
>  - remove the new parameter for update_io_ticks();
>  - simplify update_io_ticks();
>  - use switch in queue_iostats_store();
>  - add missing part_stat_local_dec() in blk_account_io_merge_request();
> Changes from RFC v2:
>  - fix that precise is ignored for the first io in update_io_ticks();
>
>  Documentation/ABI/stable/sysfs-block |  8 ++++--
>  block/blk-core.c                     | 10 +++++--
>  block/blk-merge.c                    |  3 ++
>  block/blk-mq-debugfs.c               |  2 ++
>  block/blk-mq.c                       | 11 +++++++-
>  block/blk-sysfs.c                    | 42 ++++++++++++++++++++++++++--
>  block/blk.h                          |  1 +
>  block/genhd.c                        |  2 +-
>  include/linux/blk-mq.h               |  1 +
>  include/linux/blkdev.h               |  3 ++
>  10 files changed, 74 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
> index 1fe9a553c37b..79027bf2661a 100644
> --- a/Documentation/ABI/stable/sysfs-block
> +++ b/Documentation/ABI/stable/sysfs-block
> @@ -358,8 +358,12 @@ What:		/sys/block//queue/iostats
>  Date:		January 2009
>  Contact:	linux-block@vger.kernel.org
>  Description:
> -		[RW] This file is used to control (on/off) the iostats
> -		accounting of the disk.
> +		[RW] This file is used to control the iostats accounting of the
> +		disk. If this value is 0, iostats accounting is disabled; If
> +		this value is 1, iostats accounting is enabled, but io_ticks is
> +		accounted by sampling and the result is not accurate; If this
> +		value is 2, iostats accounting is enabled and io_ticks is
> +		accounted precisely, but there will be slightly more overhead.
>
>
>  What:		/sys/block//queue/logical_block_size
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 9520ccab3050..c70dc311e3b7 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -954,11 +954,15 @@ EXPORT_SYMBOL_GPL(iocb_bio_iopoll);
>  void update_io_ticks(struct block_device *part, unsigned long now, bool end)
>  {
>  	unsigned long stamp;
> +	bool precise = blk_queue_precise_io_stat(part->bd_queue);
>  again:
>  	stamp = READ_ONCE(part->bd_stamp);
> -	if (unlikely(time_after(now, stamp))) {
> -		if (likely(try_cmpxchg(&part->bd_stamp, &stamp, now)))
> -			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
> +	if (unlikely(time_after(now, stamp)) &&
> +	    likely(try_cmpxchg(&part->bd_stamp, &stamp, now))) {
> +		if (end || (precise && part_in_flight(part)))
> +			__part_stat_add(part, io_ticks, now - stamp);
> +		else if (!precise)
> +			__part_stat_add(part, io_ticks, 1);

It would be better, or at least more readable, to move 'bool precise'
into the above branch, given that we only need to read the flag once in
each tick.

Otherwise, this patch looks fine.

Thanks,
Ming
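
To make the suggestion concrete, here is a rough userspace sketch rather than kernel code. Everything prefixed with sim_ is a hypothetical stand-in invented for illustration: struct sim_part models the partition state, a plain store replaces try_cmpxchg() because the model is single-threaded, and the in-flight count is assumed to be bumped only after the start-side call. The flag is read inside the branch, so it is only looked at when the tick actually advances. sim_util_percent() replays a queue-depth-1 workload similar in shape to the fio test in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-partition accounting state. */
struct sim_part {
	unsigned long stamp;    /* models part->bd_stamp, in jiffies */
	unsigned long io_ticks; /* models the io_ticks counter, in jiffies */
	unsigned int in_flight; /* models what part_in_flight() would report */
	bool precise;           /* models blk_queue_precise_io_stat() */
};

/* Wrap-safe "a is after b", same idea as the kernel's time_after(). */
static bool sim_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

static void sim_update_io_ticks(struct sim_part *part, unsigned long now,
				bool end)
{
	if (sim_time_after(now, part->stamp)) {
		/* The flag is read here, only when the tick has advanced. */
		bool precise = part->precise;
		unsigned long delta = now - part->stamp;

		/* Single-threaded model: a plain store instead of try_cmpxchg(). */
		part->stamp = now;
		if (end || (precise && part->in_flight))
			part->io_ticks += delta;
		else if (!precise)
			part->io_ticks += 1;
	}
}

/*
 * Replay nr_ios cycles of "service_us busy, then think_us idle" at queue
 * depth 1 and return io_ticks as a percentage of the elapsed jiffies.
 */
static unsigned long sim_util_percent(bool precise, unsigned long nr_ios,
				      unsigned long service_us,
				      unsigned long think_us,
				      unsigned long hz)
{
	struct sim_part part = { .precise = precise };
	unsigned long us_per_jiffy, elapsed, t_us = 0, i;

	assert(hz > 0 && hz <= 1000000UL && nr_ios > 0);
	us_per_jiffy = 1000000UL / hz;

	for (i = 0; i < nr_ios; i++) {
		/* IO start: accounted before the request counts as in flight. */
		sim_update_io_ticks(&part, t_us / us_per_jiffy, false);
		part.in_flight++;

		t_us += service_us;

		/* IO end. */
		sim_update_io_ticks(&part, t_us / us_per_jiffy, true);
		part.in_flight--;

		t_us += think_us;
	}

	elapsed = t_us / us_per_jiffy;
	return elapsed ? part.io_ticks * 100 / elapsed : 0;
}
```

Calling sim_util_percent(false, 1000, 100, 4000, 250) models the sampling mode with HZ=250, a made-up 100us service time and 4ms think time, and in this model the reported utilization lands close to 100% even though the device is busy for only about 2.4% of the time. The same call with precise set to true stays at a few percent. The 100us service time is an assumption, so the numbers are not expected to match the 90% quoted above.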