Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752422AbbGNPCd (ORCPT ); Tue, 14 Jul 2015 11:02:33 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:12922 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752306AbbGNPCb (ORCPT ); Tue, 14 Jul 2015 11:02:31 -0400 From: Jens Axboe To: CC: , , Jens Axboe Subject: [PATCH 2/2] block: make /sys/block//queue/discard_max_bytes writeable Date: Tue, 14 Jul 2015 09:02:19 -0600 Message-ID: <1436886139-18673-3-git-send-email-axboe@fb.com> X-Mailer: git-send-email 2.4.1.168.g1ea28e1 In-Reply-To: <1436886139-18673-1-git-send-email-axboe@fb.com> References: <1436886139-18673-1-git-send-email-axboe@fb.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [192.168.52.123] X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151,1.0.33,0.0.0000 definitions=2015-07-14_07:2015-07-14,2015-07-14,1970-01-01 signatures=0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5237 Lines: 133 Lots of devices support huge discard sizes these days. Depending on how the device handles them internally, huge discards can introduce massive latencies (hundreds of msec) on the device side. We have a sysfs file, discard_max_bytes, that advertises the max hardware supported discard size. Make this writeable, and split the settings into a soft and hard limit. This can be set from 'discard_granularity' and up to the hardware limit. Signed-off-by: Jens Axboe --- Documentation/block/queue-sysfs.txt | 4 +++- block/blk-settings.c | 4 ++++ block/blk-sysfs.c | 26 +++++++++++++++++++++++++- include/linux/blkdev.h | 1 + 4 files changed, 33 insertions(+), 2 deletions(-) diff --git a/Documentation/block/queue-sysfs.txt b/Documentation/block/queue-sysfs.txt index 3a29f8914df9..3748cf827131 100644 --- a/Documentation/block/queue-sysfs.txt +++ b/Documentation/block/queue-sysfs.txt @@ -20,7 +20,7 @@ This shows the size of internal allocation of the device in bytes, if reported by the device. A value of '0' means device does not support the discard functionality. -discard_max_bytes (RO) +discard_max_bytes (RW) ---------------------- Devices that support discard functionality may have internal limits on the number of bytes that can be trimmed or unmapped in a single operation. @@ -28,6 +28,8 @@ The discard_max_bytes parameter is set by the device driver to the maximum number of bytes that can be discarded in a single operation. Discard requests issued to the device must not exceed this limit. A discard_max_bytes value of 0 means that the device does not support discard functionality. +Writing a lower value to this file can limit the maximum discard size issued +to the device, which can help latencies. discard_zeroes_data (RO) ------------------------ diff --git a/block/blk-settings.c b/block/blk-settings.c index 12600bfffca9..b38d8d723276 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -116,6 +116,7 @@ void blk_set_default_limits(struct queue_limits *lim) lim->chunk_sectors = 0; lim->max_write_same_sectors = 0; lim->max_discard_sectors = 0; + lim->max_hw_discard_sectors = 0; lim->discard_granularity = 0; lim->discard_alignment = 0; lim->discard_misaligned = 0; @@ -303,6 +304,7 @@ EXPORT_SYMBOL(blk_queue_chunk_sectors); void blk_queue_max_discard_sectors(struct request_queue *q, unsigned int max_discard_sectors) { + q->limits.max_hw_discard_sectors = max_discard_sectors; q->limits.max_discard_sectors = max_discard_sectors; } EXPORT_SYMBOL(blk_queue_max_discard_sectors); @@ -641,6 +643,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->max_discard_sectors = min_not_zero(t->max_discard_sectors, b->max_discard_sectors); + t->max_hw_discard_sectors = min_not_zero(t->max_hw_discard_sectors, + b->max_hw_discard_sectors); t->discard_granularity = max(t->discard_granularity, b->discard_granularity); t->discard_alignment = lcm_not_zero(t->discard_alignment, alignment) % diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 6264b382d4d1..3d1dba600228 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -151,6 +151,29 @@ static ssize_t queue_discard_max_show(struct request_queue *q, char *page) (unsigned long long)q->limits.max_discard_sectors << 9); } +static ssize_t queue_discard_max_store(struct request_queue *q, + const char *page, size_t count) +{ + unsigned long max_discard; + ssize_t ret = queue_var_store(&max_discard, page, count); + + if (ret < 0) + return ret; + + if (max_discard & (q->limits.discard_granularity - 1)) + return -EINVAL; + + max_discard >>= 9; + if (max_discard > UINT_MAX) + return -EINVAL; + + if (max_discard > q->limits.max_hw_discard_sectors) + max_discard = q->limits.max_hw_discard_sectors; + + q->limits.max_discard_sectors = max_discard; + return ret; +} + static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char *page) { return queue_var_show(queue_discard_zeroes_data(q), page); @@ -361,8 +384,9 @@ static struct queue_sysfs_entry queue_discard_granularity_entry = { }; static struct queue_sysfs_entry queue_discard_max_entry = { - .attr = {.name = "discard_max_bytes", .mode = S_IRUGO }, + .attr = {.name = "discard_max_bytes", .mode = S_IRUGO | S_IWUSR }, .show = queue_discard_max_show, + .store = queue_discard_max_store, }; static struct queue_sysfs_entry queue_discard_zeroes_data_entry = { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d4068c17d0df..243f29e779ec 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -268,6 +268,7 @@ struct queue_limits { unsigned int io_min; unsigned int io_opt; unsigned int max_discard_sectors; + unsigned int max_hw_discard_sectors; unsigned int max_write_same_sectors; unsigned int discard_granularity; unsigned int discard_alignment; -- 2.4.1.168.g1ea28e1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/