From: Sarthak Kukreti <sarthakkukreti@chromium.org>
Date: Tue, 18 Apr 2023 15:13:05 -0700
Subject: Re: [PATCH v3 1/3] block: Introduce provisioning primitives
To: Brian Foster
Cc: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Jens Axboe, "Michael S. Tsirkin",
	Jason Wang, Stefan Hajnoczi, Alasdair Kergon, Mike Snitzer,
	Christoph Hellwig, "Theodore Ts'o", Andreas Dilger,
	Bart Van Assche, Daniil Lunev, "Darrick J. Wong"
List-ID: linux-ext4@vger.kernel.org

On Mon, Apr 17, 2023 at 10:33 AM Brian Foster wrote:
>
> On Thu, Apr 13, 2023 at 05:02:17PM -0700, Sarthak Kukreti wrote:
> > Introduce block request REQ_OP_PROVISION. The intent of this request
> > is to request underlying storage to preallocate disk space for the given
> > block range. Block devices that support this capability will export
> > a provision limit within their request queues.
> >
> > This patch also adds the capability to call fallocate() in mode 0
> > on block devices, which will send REQ_OP_PROVISION to the block
> > device for the specified range.
> >
> > Signed-off-by: Sarthak Kukreti
> > ---
> >  block/blk-core.c          |  5 ++++
> >  block/blk-lib.c           | 53 +++++++++++++++++++++++++++++++++++++++
> >  block/blk-merge.c         | 18 +++++++++++++
> >  block/blk-settings.c      | 19 ++++++++++++++
> >  block/blk-sysfs.c         |  8 ++++++
> >  block/bounce.c            |  1 +
> >  block/fops.c              | 14 ++++++++---
> >  include/linux/bio.h       |  6 +++--
> >  include/linux/blk_types.h |  5 +++-
> >  include/linux/blkdev.h    | 16 ++++++++++++
> >  10 files changed, 138 insertions(+), 7 deletions(-)
> >
> ...
> > diff --git a/block/fops.c b/block/fops.c
> > index d2e6be4e3d1c..f82da2fb8af0 100644
> > --- a/block/fops.c
> > +++ b/block/fops.c
> > @@ -625,7 +625,7 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> >  	int error;
> >
> >  	/* Fail if we don't recognize the flags. */
> > -	if (mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> > +	if (mode != 0 && mode & ~BLKDEV_FALLOC_FL_SUPPORTED)
> >  		return -EOPNOTSUPP;
> >
> >  	/* Don't go off the end of the device. */
> > @@ -649,11 +649,17 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> >  	filemap_invalidate_lock(inode->i_mapping);
> >
> >  	/* Invalidate the page cache, including dirty pages. */
> > -	error = truncate_bdev_range(bdev, file->f_mode, start, end);
> > -	if (error)
> > -		goto fail;
> > +	if (mode != 0) {
> > +		error = truncate_bdev_range(bdev, file->f_mode, start, end);
> > +		if (error)
> > +			goto fail;
> > +	}
> >
> >  	switch (mode) {
> > +	case 0:
> > +		error = blkdev_issue_provision(bdev, start >> SECTOR_SHIFT,
> > +					       len >> SECTOR_SHIFT, GFP_KERNEL);
> > +		break;
>
> I would think we'd want to support any combination of
> FALLOC_FL_KEEP_SIZE and FALLOC_FL_UNSHARE_RANGE..? All of the other
> commands support the former modifier, for one. It also looks like if
> somebody attempts to invoke with mode == FALLOC_FL_KEEP_SIZE, even with
> the current upstream code that would perform the bdev truncate before
> returning -EOPNOTSUPP. That seems like a bit of an unfortunate side
> effect to me.
>
Added a separate flag set to decide whether we should truncate or not.

> WRT to unshare, if the PROVISION request is always going to imply an
> unshare (which seems reasonable to me), there's probably no reason to
> -EOPNOTSUPP if a caller explicitly passes UNSHARE_RANGE.
>
Added handling in v4. Thanks!
Sarthak

> Brian
>
> >  	case FALLOC_FL_ZERO_RANGE:
> >  	case FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE:
> >  		error = blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT,
> > diff --git a/include/linux/bio.h b/include/linux/bio.h
> > index d766be7152e1..9820b3b039f2 100644
> > --- a/include/linux/bio.h
> > +++ b/include/linux/bio.h
> > @@ -57,7 +57,8 @@ static inline bool bio_has_data(struct bio *bio)
> >  	    bio->bi_iter.bi_size &&
> >  	    bio_op(bio) != REQ_OP_DISCARD &&
> >  	    bio_op(bio) != REQ_OP_SECURE_ERASE &&
> > -	    bio_op(bio) != REQ_OP_WRITE_ZEROES)
> > +	    bio_op(bio) != REQ_OP_WRITE_ZEROES &&
> > +	    bio_op(bio) != REQ_OP_PROVISION)
> >  		return true;
> >
> >  	return false;
> > @@ -67,7 +68,8 @@ static inline bool bio_no_advance_iter(const struct bio *bio)
> >  {
> >  	return bio_op(bio) == REQ_OP_DISCARD ||
> >  	       bio_op(bio) == REQ_OP_SECURE_ERASE ||
> > -	       bio_op(bio) == REQ_OP_WRITE_ZEROES;
> > +	       bio_op(bio) == REQ_OP_WRITE_ZEROES ||
> > +	       bio_op(bio) == REQ_OP_PROVISION;
> >  }
> >
> >  static inline void *bio_data(struct bio *bio)
> > diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> > index 99be590f952f..27bdf88f541c 100644
> > --- a/include/linux/blk_types.h
> > +++ b/include/linux/blk_types.h
> > @@ -385,7 +385,10 @@ enum req_op {
> >  	REQ_OP_DRV_IN		= (__force blk_opf_t)34,
> >  	REQ_OP_DRV_OUT		= (__force blk_opf_t)35,
> >
> > -	REQ_OP_LAST		= (__force blk_opf_t)36,
> > +	/* request device to provision block */
> > +	REQ_OP_PROVISION	= (__force blk_opf_t)37,
> > +
> > +	REQ_OP_LAST		= (__force blk_opf_t)38,
> >  };
> >
> >  enum req_flag_bits {
> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> > index 941304f17492..239e2f418b6e 100644
> > --- a/include/linux/blkdev.h
> > +++ b/include/linux/blkdev.h
> > @@ -303,6 +303,7 @@ struct queue_limits {
> >  	unsigned int		discard_granularity;
> >  	unsigned int		discard_alignment;
> >  	unsigned int		zone_write_granularity;
> > +	unsigned int		max_provision_sectors;
> >
> >  	unsigned short		max_segments;
> >  	unsigned short		max_integrity_segments;
> > @@ -921,6 +922,8 @@ extern void blk_queue_max_discard_sectors(struct request_queue *q,
> >  		unsigned int max_discard_sectors);
> >  extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
> >  		unsigned int max_write_same_sectors);
> > +extern void blk_queue_max_provision_sectors(struct request_queue *q,
> > +		unsigned int max_provision_sectors);
> >  extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
> >  extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
> >  		unsigned int max_zone_append_sectors);
> > @@ -1060,6 +1063,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >  int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
> >  		sector_t nr_sects, gfp_t gfp);
> >
> > +extern int blkdev_issue_provision(struct block_device *bdev, sector_t sector,
> > +		sector_t nr_sects, gfp_t gfp_mask);
> > +
> >  #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
> >  #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */
> >
> > @@ -1139,6 +1145,11 @@ static inline unsigned short queue_max_discard_segments(const struct request_que
> >  	return q->limits.max_discard_segments;
> >  }
> >
> > +static inline unsigned short queue_max_provision_sectors(const struct request_queue *q)
> > +{
> > +	return q->limits.max_provision_sectors;
> > +}
> > +
> >  static inline unsigned int queue_max_segment_size(const struct request_queue *q)
> >  {
> >  	return q->limits.max_segment_size;
> > @@ -1281,6 +1292,11 @@ static inline bool bdev_nowait(struct block_device *bdev)
> >  	return test_bit(QUEUE_FLAG_NOWAIT, &bdev_get_queue(bdev)->queue_flags);
> >  }
> >
> > +static inline unsigned int bdev_max_provision_sectors(struct block_device *bdev)
> > +{
> > +	return bdev_get_queue(bdev)->limits.max_provision_sectors;
> > +}
> > +
> >  static inline enum blk_zoned_model bdev_zoned_model(struct block_device *bdev)
> >  {
> >  	return blk_queue_zoned_model(bdev_get_queue(bdev));
> > --
> > 2.40.0.634.g4ca3ef3211-goog
> >
>
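[Editor's aside] The mode-0 blkdev_fallocate() path discussed above is reachable directly from userspace through plain fallocate(2). Below is a minimal sketch of such a caller; `provision_range()` and the device path in the comment are illustrative, not part of the patch, and on a regular file the same call simply takes the ordinary filesystem preallocation path.

```c
/* Sketch: request preallocation via fallocate(mode = 0).
 * With this series applied, issuing this against a block device whose
 * queue advertises a provision limit sends REQ_OP_PROVISION down the
 * stack; a device without support fails with EOPNOTSUPP. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Preallocate [offset, offset + len) on path; returns 0 on success. */
int provision_range(const char *path, off_t offset, off_t len)
{
	int fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0) {
		perror("open");
		return -1;
	}
	/* mode == 0: allocate blocks and update i_size. For block devices
	 * with this series, the kernel routes the request to
	 * blkdev_issue_provision(). */
	if (fallocate(fd, 0, offset, len) < 0) {
		perror("fallocate");
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}
```

Calling this as, say, `provision_range("/dev/dm-0", 0, 1 << 20)` (hypothetical device node) would exercise the new `case 0:` branch in blkdev_fallocate(); calling it on a regular file exercises normal preallocation.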