Subject: [PATCH v3 2/3] block/diskstats: accumulate all per-cpu counters in one pass
From: Konstantin Khlebnikov
To: linux-block@vger.kernel.org, Jens Axboe, linux-kernel@vger.kernel.org
Cc: Mikulas Patocka, Mike Snitzer, Ming Lei
Date: Tue, 24 Mar 2020 09:39:43 +0300
Message-ID: <158503198306.1955.15150686320152629671.stgit@buzz>
In-Reply-To: <158503038812.1955.7827988255138056389.stgit@buzz>
References: <158503038812.1955.7827988255138056389.stgit@buzz>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Reading /proc/diskstats iterates over all cpus once for every field it sums.
It is faster to sum all fields in one pass.
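For readers less familiar with the block layer's per-cpu counters, the sketch
below contrasts the two access patterns. It is a standalone userspace
illustration, not part of the patch: the struct layout, the NR_CPUS and
NR_GROUPS values and both helper functions are invented stand-ins for
struct disk_stats and the per-cpu machinery.

#include <stdio.h>
#include <string.h>

#define NR_CPUS   4             /* illustrative values only */
#define NR_GROUPS 4             /* read, write, discard, flush */

/* Simplified stand-in for struct disk_stats. */
struct stats {
        unsigned long ios[NR_GROUPS];
        unsigned long merges[NR_GROUPS];
        unsigned long sectors[NR_GROUPS];
};

/* Stands in for the per-cpu copies kept by the block layer. */
static struct stats percpu[NR_CPUS];

/* Old pattern: every field is summed with its own sweep over all CPUs,
 * so formatting N fields walks the per-cpu data N times. */
static unsigned long sum_ios(int group)
{
        unsigned long v = 0;
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                v += percpu[cpu].ios[group];
        return v;
}

/* New pattern: a single pass over the CPUs accumulates every field into
 * one snapshot, which the caller then formats. */
static void sum_all(struct stats *out)
{
        int cpu, group;

        memset(out, 0, sizeof(*out));
        for (cpu = 0; cpu < NR_CPUS; cpu++) {
                for (group = 0; group < NR_GROUPS; group++) {
                        out->ios[group]     += percpu[cpu].ios[group];
                        out->merges[group]  += percpu[cpu].merges[group];
                        out->sectors[group] += percpu[cpu].sectors[group];
                }
        }
}

int main(void)
{
        struct stats snap;

        percpu[0].ios[0] = 3;
        percpu[1].ios[0] = 5;

        sum_all(&snap);
        printf("read ios: per-field sum %lu, one-pass sum %lu\n",
               sum_ios(0), snap.ios[0]);
        return 0;
}

Formatting N fields the old way sweeps every per-cpu copy N times; the
one-pass version sweeps each copy once, which is where the improvement
measured below comes from.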
Hammering /proc/diskstats with fio shows a 2x performance improvement:

fio --name=test --numjobs=$JOBS --filename=/proc/diskstats \
    --size=1k --bs=1k --fallocate=none --create_on_open=1 \
    --time_based=1 --runtime=10 --invalidate=0 --group_reporting

        JOBS=1    JOBS=10
Before:  7k iops  64k iops
After:  18k iops 120k iops

The code also becomes more compact:

add/remove: 1/0 grow/shrink: 0/2 up/down: 194/-1540 (-1346)
Function                 old     new   delta
part_stat_read_all         -     194    +194
diskstats_show          1344     631    -713
part_stat_show          1219     392    -827
Total: Before=14966947, After=14965601, chg -0.01%

Signed-off-by: Konstantin Khlebnikov
---
 block/genhd.c             |   61 +++++++++++++++++++++++++++++++++------------
 block/partition-generic.c |   35 ++++++++++++++-------------
 include/linux/genhd.h     |   11 ++++++--
 3 files changed, 72 insertions(+), 35 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index b210c12c4870..606e8755f6ed 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -69,6 +69,28 @@ void set_capacity_revalidate_and_notify(struct gendisk *disk, sector_t size,
 
 EXPORT_SYMBOL_GPL(set_capacity_revalidate_and_notify);
 
+#ifdef CONFIG_SMP
+void part_stat_read_all(struct hd_struct *part, struct disk_stats *stat)
+{
+        int cpu;
+
+        memset(stat, 0, sizeof(struct disk_stats));
+        for_each_possible_cpu(cpu) {
+                struct disk_stats *ptr = per_cpu_ptr(part->dkstats, cpu);
+                int group;
+
+                for (group = 0; group < NR_STAT_GROUPS; group++) {
+                        stat->nsecs[group] += ptr->nsecs[group];
+                        stat->sectors[group] += ptr->sectors[group];
+                        stat->ios[group] += ptr->ios[group];
+                        stat->merges[group] += ptr->merges[group];
+                }
+
+                stat->io_ticks += ptr->io_ticks;
+                stat->time_in_queue += ptr->time_in_queue;
+        }
+}
+#endif /* CONFIG_SMP */
 void part_inc_in_flight(struct request_queue *q, struct hd_struct *part, int rw)
 {
@@ -1429,6 +1451,7 @@ static int diskstats_show(struct seq_file *seqf, void *v)
         struct hd_struct *hd;
         char buf[BDEVNAME_SIZE];
         unsigned int inflight;
+        struct disk_stats stat;
 
         /*
         if (&disk_to_dev(gp)->kobj.entry == block_class.devices.next)
@@ -1440,7 +1463,9 @@ static int diskstats_show(struct seq_file *seqf, void *v)
         disk_part_iter_init(&piter, gp, DISK_PITER_INCL_EMPTY_PART0);
         while ((hd = disk_part_iter_next(&piter))) {
+                part_stat_read_all(hd, &stat);
                 inflight = part_in_flight(gp->queue, hd);
+
                 seq_printf(seqf, "%4d %7d %s "
                            "%lu %lu %lu %u "
                            "%lu %lu %lu %u "
@@ -1450,23 +1475,27 @@ static int diskstats_show(struct seq_file *seqf, void *v)
                            "\n",
                            MAJOR(part_devt(hd)), MINOR(part_devt(hd)),
                            disk_name(gp, hd->partno, buf),
-                           part_stat_read(hd, ios[STAT_READ]),
-                           part_stat_read(hd, merges[STAT_READ]),
-                           part_stat_read(hd, sectors[STAT_READ]),
-                           (unsigned int)part_stat_read_msecs(hd, STAT_READ),
-                           part_stat_read(hd, ios[STAT_WRITE]),
-                           part_stat_read(hd, merges[STAT_WRITE]),
-                           part_stat_read(hd, sectors[STAT_WRITE]),
-                           (unsigned int)part_stat_read_msecs(hd, STAT_WRITE),
+                           stat.ios[STAT_READ],
+                           stat.merges[STAT_READ],
+                           stat.sectors[STAT_READ],
+                           (unsigned int)div_u64(stat.nsecs[STAT_READ],
+                                                 NSEC_PER_MSEC),
+                           stat.ios[STAT_WRITE],
+                           stat.merges[STAT_WRITE],
+                           stat.sectors[STAT_WRITE],
+                           (unsigned int)div_u64(stat.nsecs[STAT_WRITE],
+                                                 NSEC_PER_MSEC),
                            inflight,
-                           jiffies_to_msecs(part_stat_read(hd, io_ticks)),
-                           jiffies_to_msecs(part_stat_read(hd, time_in_queue)),
-                           part_stat_read(hd, ios[STAT_DISCARD]),
-                           part_stat_read(hd, merges[STAT_DISCARD]),
-                           part_stat_read(hd, sectors[STAT_DISCARD]),
-                           (unsigned int)part_stat_read_msecs(hd, STAT_DISCARD),
-                           part_stat_read(hd, ios[STAT_FLUSH]),
-                           (unsigned int)part_stat_read_msecs(hd, STAT_FLUSH)
+                           jiffies_to_msecs(stat.io_ticks),
+                           jiffies_to_msecs(stat.time_in_queue),
+                           stat.ios[STAT_DISCARD],
+                           stat.merges[STAT_DISCARD],
+                           stat.sectors[STAT_DISCARD],
+                           (unsigned int)div_u64(stat.nsecs[STAT_DISCARD],
+                                                 NSEC_PER_MSEC),
+                           stat.ios[STAT_FLUSH],
+                           (unsigned int)div_u64(stat.nsecs[STAT_FLUSH],
+                                                 NSEC_PER_MSEC)
                            );
         }
         disk_part_iter_exit(&piter);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 564fae77711d..6d3fcb5187cb 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -120,9 +120,12 @@ ssize_t part_stat_show(struct device *dev,
 {
         struct hd_struct *p = dev_to_part(dev);
         struct request_queue *q = part_to_disk(p)->queue;
+        struct disk_stats stat;
         unsigned int inflight;
 
+        part_stat_read_all(p, &stat);
         inflight = part_in_flight(q, p);
+
         return sprintf(buf,
                 "%8lu %8lu %8llu %8u "
                 "%8lu %8lu %8llu %8u "
@@ -130,23 +133,23 @@ ssize_t part_stat_show(struct device *dev,
                 "%8lu %8lu %8llu %8u "
                 "%8lu %8u"
                 "\n",
-                part_stat_read(p, ios[STAT_READ]),
-                part_stat_read(p, merges[STAT_READ]),
-                (unsigned long long)part_stat_read(p, sectors[STAT_READ]),
-                (unsigned int)part_stat_read_msecs(p, STAT_READ),
-                part_stat_read(p, ios[STAT_WRITE]),
-                part_stat_read(p, merges[STAT_WRITE]),
-                (unsigned long long)part_stat_read(p, sectors[STAT_WRITE]),
-                (unsigned int)part_stat_read_msecs(p, STAT_WRITE),
+                stat.ios[STAT_READ],
+                stat.merges[STAT_READ],
+                (unsigned long long)stat.sectors[STAT_READ],
+                (unsigned int)div_u64(stat.nsecs[STAT_READ], NSEC_PER_MSEC),
+                stat.ios[STAT_WRITE],
+                stat.merges[STAT_WRITE],
+                (unsigned long long)stat.sectors[STAT_WRITE],
+                (unsigned int)div_u64(stat.nsecs[STAT_WRITE], NSEC_PER_MSEC),
                 inflight,
-                jiffies_to_msecs(part_stat_read(p, io_ticks)),
-                jiffies_to_msecs(part_stat_read(p, time_in_queue)),
-                part_stat_read(p, ios[STAT_DISCARD]),
-                part_stat_read(p, merges[STAT_DISCARD]),
-                (unsigned long long)part_stat_read(p, sectors[STAT_DISCARD]),
-                (unsigned int)part_stat_read_msecs(p, STAT_DISCARD),
-                part_stat_read(p, ios[STAT_FLUSH]),
-                (unsigned int)part_stat_read_msecs(p, STAT_FLUSH));
+                jiffies_to_msecs(stat.io_ticks),
+                jiffies_to_msecs(stat.time_in_queue),
+                stat.ios[STAT_DISCARD],
+                stat.merges[STAT_DISCARD],
+                (unsigned long long)stat.sectors[STAT_DISCARD],
+                (unsigned int)div_u64(stat.nsecs[STAT_DISCARD], NSEC_PER_MSEC),
+                stat.ios[STAT_FLUSH],
+                (unsigned int)div_u64(stat.nsecs[STAT_FLUSH], NSEC_PER_MSEC));
 }
 
 ssize_t part_inflight_show(struct device *dev, struct device_attribute *attr,
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index f1066f10b062..ead3ffb7f327 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -375,6 +375,8 @@ static inline void part_stat_set_all(struct hd_struct *part, int value)
                         sizeof(struct disk_stats));
 }
 
+void part_stat_read_all(struct hd_struct *part, struct disk_stats *stat);
+
 static inline int init_part_stats(struct hd_struct *part)
 {
         part->dkstats = alloc_percpu(struct disk_stats);
@@ -401,6 +403,12 @@ static inline void part_stat_set_all(struct hd_struct *part, int value)
         memset(&part->dkstats, value, sizeof(struct disk_stats));
 }
 
+static inline void part_stat_read_all(struct hd_struct *part,
+                                      struct disk_stats *stat)
+{
+        memcpy(stat, &part->dkstats, sizeof(struct disk_stats));
+}
+
 static inline int init_part_stats(struct hd_struct *part)
 {
         return 1;
@@ -412,9 +420,6 @@ static inline void free_part_stats(struct hd_struct *part)
 
 #endif /* CONFIG_SMP */
 
-#define part_stat_read_msecs(part, which)                               \
-        div_u64(part_stat_read(part, nsecs[which]), NSEC_PER_MSEC)
-
 #define part_stat_read_accum(part, field)                               \
         (part_stat_read(part, field[STAT_READ]) +                       \
          part_stat_read(part, field[STAT_WRITE]) +                      \