Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp2305729imu; Wed, 21 Nov 2018 09:33:28 -0800 (PST) X-Google-Smtp-Source: AFSGD/UXme5VfFRUtXa2UKnmDHB0mBGwFQQ5ixPM0nJWdmbOdOTFBsMHWpicg3EETYVxrndGuIex X-Received: by 2002:a63:c70d:: with SMTP id n13mr2395903pgg.108.1542821608520; Wed, 21 Nov 2018 09:33:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542821608; cv=none; d=google.com; s=arc-20160816; b=LzdmECmye9WQlJf5XAS6GFcqq7cE2K2dLG9TmnCVxC/ikDzQjRSoabbskMsUpL2eKV WH4VV00YnnmAHmgg3t5i9bdXQQDVt5XVn54hH3GNNCr1rEdcyCnrWvEYeiXUsuvD+eSi 2umXX4cg+AC6fV1M0j+2fLXd01rZ3Q3lFOoJd/6W2YIFey/P5sHuJ5+0GCAtIsNtyp88 6tcdZhrYjY922E4plGwwAedhJJPg+8tHi3AlQN87yNHTF0K94YuDzT6wi+UDotCkJKyi Tncxy5eXATrN3SHnrLb2eLeKBA7r2rXaH9Ymxr0Ez+VQDSNzfsdNrNuGLo034b+29+++ mttQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=PZJ7SvtH1y3Z5o67CgTDUMEHSqsFPQEDj7OAI52FECM=; b=jfJH5cOPQW8vZDOhEVEWbbZHd6GfIh9O0Q96P6cl1S8upleYJQy1wNg3HL/MWwyEry PHGQoLQUoHMG9rgXh+O256pbmhmPMKePMETOa93RfK0b9augYIaNLzm8AeDYqHuSZZTv z91tGaxshiZkFvQkFhA8vfUo9LWBhatCbxAWD2xAEEq+eZaPoQaORJqoZkGRmMwlvAvR 9gYvikAR9e5ZQJpsQ8KhPR2BY6Y5K5isQyH6vjhKyrvTGpyhiDx1EStG+b+IkWwpSBL7 KrwU1SvOSIJoe+AmPUee32Mimimf0axohKTH0YxpQTei2/azTrSu5YIhUQJoYFYn6QuG NOQw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t5si41176670pgc.369.2018.11.21.09.33.06; Wed, 21 Nov 2018 09:33:28 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731042AbeKVBIi (ORCPT + 99 others); Wed, 21 Nov 2018 20:08:38 -0500 Received: from verein.lst.de ([213.95.11.211]:51717 "EHLO newverein.lst.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726778AbeKVBIh (ORCPT ); Wed, 21 Nov 2018 20:08:37 -0500 Received: by newverein.lst.de (Postfix, from userid 2407) id E98EB68C19; Wed, 21 Nov 2018 15:33:55 +0100 (CET) Date: Wed, 21 Nov 2018 15:33:55 +0100 From: Christoph Hellwig To: Ming Lei Cc: Jens Axboe , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Theodore Ts'o , Omar Sandoval , Sagi Grimberg , Dave Chinner , Kent Overstreet , Mike Snitzer , dm-devel@redhat.com, Alexander Viro , linux-fsdevel@vger.kernel.org, Shaohua Li , linux-raid@vger.kernel.org, David Sterba , linux-btrfs@vger.kernel.org, "Darrick J . Wong" , linux-xfs@vger.kernel.org, Gao Xiang , Christoph Hellwig , linux-ext4@vger.kernel.org, Coly Li , linux-bcache@vger.kernel.org, Boaz Harrosh , Bob Peterson , cluster-devel@redhat.com Subject: Re: [PATCH V11 14/19] block: handle non-cluster bio out of blk_bio_segment_split Message-ID: <20181121143355.GB2594@lst.de> References: <20181121032327.8434-1-ming.lei@redhat.com> <20181121032327.8434-15-ming.lei@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20181121032327.8434-15-ming.lei@redhat.com> User-Agent: Mutt/1.5.17 (2007-11-01) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > + non-cluster.o Do we really need a new source file for these few functions? > default: > + if (!blk_queue_cluster(q)) { > + blk_queue_non_cluster_bio(q, bio); > + return; I'd name this blk_bio_segment_split_singlepage or similar. > +static __init int init_non_cluster_bioset(void) > +{ > + WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, > + BIOSET_NEED_BVECS)); > + WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); > + WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0)); Please only allocate the resources once a queue without the cluster flag is registered, there are only very few modern drivers that do that. > +static void non_cluster_end_io(struct bio *bio) > +{ > + struct bio *bio_orig = bio->bi_private; > + > + bio_orig->bi_status = bio->bi_status; > + bio_endio(bio_orig); > + bio_put(bio); > +} Why can't we use bio_chain for the split bios? > + bio_for_each_segment(from, *bio_orig, iter) { > + if (i++ < max_segs) > + sectors += from.bv_len >> 9; > + else > + break; > + } The easy to read way would be: bio_for_each_segment(from, *bio_orig, iter) { if (i++ == max_segs) break; sectors += from.bv_len >> 9; } > + if (sectors < bio_sectors(*bio_orig)) { > + bio = bio_split(*bio_orig, sectors, GFP_NOIO, > + &non_cluster_bio_split); > + bio_chain(bio, *bio_orig); > + generic_make_request(*bio_orig); > + *bio_orig = bio; I don't think this is very efficient, as this means we now clone the bio twice, first to split it at the sector boundary, and then again when converting it to single-page bio_vec. I think this could be something like this (totally untested): diff --git a/block/non-cluster.c b/block/non-cluster.c index 9c2910be9404..60389f275c43 100644 --- a/block/non-cluster.c +++ b/block/non-cluster.c @@ -13,58 +13,59 @@ #include "blk.h" -static struct bio_set non_cluster_bio_set, non_cluster_bio_split; +static struct bio_set non_cluster_bio_set; static __init int init_non_cluster_bioset(void) { WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS)); WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); - WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0)); return 0; } __initcall(init_non_cluster_bioset); -static void non_cluster_end_io(struct bio *bio) -{ - struct bio *bio_orig = bio->bi_private; - - bio_orig->bi_status = bio->bi_status; - bio_endio(bio_orig); - bio_put(bio); -} - void blk_queue_non_cluster_bio(struct request_queue *q, struct bio **bio_orig) { - struct bio *bio; struct bvec_iter iter; - struct bio_vec from; - unsigned i = 0; - unsigned sectors = 0; - unsigned short max_segs = min_t(unsigned short, BIO_MAX_PAGES, - queue_max_segments(q)); + struct bio *bio; + struct bio_vec bv; + unsigned short max_segs, segs = 0; + + bio = bio_alloc_bioset(GFP_NOIO, bio_segments(*bio_orig), + &non_cluster_bio_set); + bio->bi_disk = (*bio_orig)->bi_disk; + bio->bi_partno = (*bio_orig)->bi_partno; + bio_set_flag(bio, BIO_CLONED); + if (bio_flagged(*bio_orig, BIO_THROTTLED)) + bio_set_flag(bio, BIO_THROTTLED); + bio->bi_opf = (*bio_orig)->bi_opf; + bio->bi_ioprio = (*bio_orig)->bi_ioprio; + bio->bi_write_hint = (*bio_orig)->bi_write_hint; + bio->bi_iter.bi_sector = (*bio_orig)->bi_iter.bi_sector; + bio->bi_iter.bi_size = (*bio_orig)->bi_iter.bi_size; + + if (bio_integrity(*bio_orig)) + bio_integrity_clone(bio, *bio_orig, GFP_NOIO); - bio_for_each_segment(from, *bio_orig, iter) { - if (i++ < max_segs) - sectors += from.bv_len >> 9; - else + bio_clone_blkcg_association(bio, *bio_orig); + + max_segs = min_t(unsigned short, queue_max_segments(q), BIO_MAX_PAGES); + bio_for_each_segment(bv, *bio_orig, iter) { + bio->bi_io_vec[segs++] = bv; + if (segs++ == max_segs) break; } - if (sectors < bio_sectors(*bio_orig)) { - bio = bio_split(*bio_orig, sectors, GFP_NOIO, - &non_cluster_bio_split); - bio_chain(bio, *bio_orig); - generic_make_request(*bio_orig); - *bio_orig = bio; - } - bio = bio_clone_bioset(*bio_orig, GFP_NOIO, &non_cluster_bio_set); + bio->bi_vcnt = segs; + bio->bi_phys_segments = segs; + bio_set_flag(bio, BIO_SEG_VALID); + bio_chain(bio, *bio_orig); - bio->bi_phys_segments = bio_segments(bio); - bio_set_flag(bio, BIO_SEG_VALID); - bio->bi_end_io = non_cluster_end_io; + if (bio_integrity(bio)) + bio_integrity_trim(bio); + bio_advance(bio, (*bio_orig)->bi_iter.bi_size); - bio->bi_private = *bio_orig; + generic_make_request(*bio_orig); *bio_orig = bio; }