Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp2364261imu; Wed, 21 Nov 2018 10:25:57 -0800 (PST) X-Google-Smtp-Source: AFSGD/XDkozbCM9DVdVjhoiQ5hN8QH8ukYRQLi+ozL9nsxb/1aGdUSyO6Qt87txztMfObysWF06q X-Received: by 2002:a63:e302:: with SMTP id f2mr7135046pgh.320.1542824757007; Wed, 21 Nov 2018 10:25:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542824756; cv=none; d=google.com; s=arc-20160816; b=x2/aXH2gWjZI5xnMBI8qtzXiz2NIBeqvpK6fw9bCNkewKZ4ufWuracVq/2+R7denBQ cIA5Q1eYiUoWRUzkBIBOjxvZfCgjtFJqD6G4eSK6GtymtiOZZ8ECDh9PqgxKRMWkHn5c oxFTT+fOi2gL/vmwLAMstBPbAb2CqrT6hk8Nz/sTng50ruNMAFyxpE4fIvGa23cQfGjs 7SrefzRuUlpIM+vtR2r/ZexpIwls6GIYGLFxitSx319DGQKZ86T6SxFQK4jau58c2I9c p1mit6meXlDS1PV+KC2dknqLksnOyXycyrGk9SqbUAP4vKhvqdd1ixZ7nNgA13J7/xrx 2Vdw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=qZeXfEGZW0OyV++KYi9sjT7/mrQKjXxQMjCgKK/XGx4=; b=U13wt9WHSBATE25pgz8efhtpf5oQnCKkMiDoo88nAkUwwumshDjPzb6SGGFfGeRxV8 LzTsQSQuNRzr5pDCwA2VHiwUi60CJvoQno/rlSdHzjLYMQ3+/2xNuMj6XJGIFSgy2SuX 58fHk+JLX3vDgZvZM/woiv+gmEI2B1HypvLQpmhlQgoLeFt5Z0IIXEjaOdfSuwgkvWQE IBWKuFNw2ZeDgmBOdm9Gs+S7g3f0VToUQI3WchQsSHivpzD8a+hbSBtE8U/ewHqIBeh8 JoP5OFeBBbtSZWgOJEuDn3MTxRzYdCI8P/OrzLfAFmlN0Dhsu8TnCzrsVbpi1BrN3YW6 TEhw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i29si24920924pgb.550.2018.11.21.10.25.38; Wed, 21 Nov 2018 10:25:56 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731458AbeKVCMv (ORCPT + 99 others); Wed, 21 Nov 2018 21:12:51 -0500 Received: from mx1.redhat.com ([209.132.183.28]:38888 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730072AbeKVCMv (ORCPT ); Wed, 21 Nov 2018 21:12:51 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 87EDC3084297; Wed, 21 Nov 2018 15:37:54 +0000 (UTC) Received: from ming.t460p (ovpn-8-20.pek2.redhat.com [10.72.8.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 49A825C21E; Wed, 21 Nov 2018 15:37:31 +0000 (UTC) Date: Wed, 21 Nov 2018 23:37:27 +0800 From: Ming Lei To: Christoph Hellwig Cc: Jens Axboe , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Theodore Ts'o , Omar Sandoval , Sagi Grimberg , Dave Chinner , Kent Overstreet , Mike Snitzer , dm-devel@redhat.com, Alexander Viro , linux-fsdevel@vger.kernel.org, Shaohua Li , linux-raid@vger.kernel.org, David Sterba , linux-btrfs@vger.kernel.org, "Darrick J . Wong" , linux-xfs@vger.kernel.org, Gao Xiang , linux-ext4@vger.kernel.org, Coly Li , linux-bcache@vger.kernel.org, Boaz Harrosh , Bob Peterson , cluster-devel@redhat.com Subject: Re: [PATCH V11 14/19] block: handle non-cluster bio out of blk_bio_segment_split Message-ID: <20181121153726.GC19111@ming.t460p> References: <20181121032327.8434-1-ming.lei@redhat.com> <20181121032327.8434-15-ming.lei@redhat.com> <20181121143355.GB2594@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20181121143355.GB2594@lst.de> User-Agent: Mutt/1.9.1 (2017-09-22) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.40]); Wed, 21 Nov 2018 15:37:55 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Nov 21, 2018 at 03:33:55PM +0100, Christoph Hellwig wrote: > > + non-cluster.o > > Do we really need a new source file for these few functions? > > > default: > > + if (!blk_queue_cluster(q)) { > > + blk_queue_non_cluster_bio(q, bio); > > + return; > > I'd name this blk_bio_segment_split_singlepage or similar. OK. > > > +static __init int init_non_cluster_bioset(void) > > +{ > > + WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, > > + BIOSET_NEED_BVECS)); > > + WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); > > + WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0)); > > Please only allocate the resources once a queue without the cluster > flag is registered, there are only very few modern drivers that do that. OK. > > > +static void non_cluster_end_io(struct bio *bio) > > +{ > > + struct bio *bio_orig = bio->bi_private; > > + > > + bio_orig->bi_status = bio->bi_status; > > + bio_endio(bio_orig); > > + bio_put(bio); > > +} > > Why can't we use bio_chain for the split bios? The parent bio is multi-page bvec, we can't submit it for non-cluster. > > > + bio_for_each_segment(from, *bio_orig, iter) { > > + if (i++ < max_segs) > > + sectors += from.bv_len >> 9; > > + else > > + break; > > + } > > The easy to read way would be: > > bio_for_each_segment(from, *bio_orig, iter) { > if (i++ == max_segs) > break; > sectors += from.bv_len >> 9; > } OK. > > > + if (sectors < bio_sectors(*bio_orig)) { > > + bio = bio_split(*bio_orig, sectors, GFP_NOIO, > > + &non_cluster_bio_split); > > + bio_chain(bio, *bio_orig); > > + generic_make_request(*bio_orig); > > + *bio_orig = bio; > > I don't think this is very efficient, as this means we now > clone the bio twice, first to split it at the sector boundary, > and then again when converting it to single-page bio_vec. That is exactly what bounce code does. The problem for both bounce and non-cluster is same actually because the bvec table itself has to be changed. > > I think this could be something like this (totally untested): > > diff --git a/block/non-cluster.c b/block/non-cluster.c > index 9c2910be9404..60389f275c43 100644 > --- a/block/non-cluster.c > +++ b/block/non-cluster.c > @@ -13,58 +13,59 @@ > > #include "blk.h" > > -static struct bio_set non_cluster_bio_set, non_cluster_bio_split; > +static struct bio_set non_cluster_bio_set; > > static __init int init_non_cluster_bioset(void) > { > WARN_ON(bioset_init(&non_cluster_bio_set, BIO_POOL_SIZE, 0, > BIOSET_NEED_BVECS)); > WARN_ON(bioset_integrity_create(&non_cluster_bio_set, BIO_POOL_SIZE)); > - WARN_ON(bioset_init(&non_cluster_bio_split, BIO_POOL_SIZE, 0, 0)); > > return 0; > } > __initcall(init_non_cluster_bioset); > > -static void non_cluster_end_io(struct bio *bio) > -{ > - struct bio *bio_orig = bio->bi_private; > - > - bio_orig->bi_status = bio->bi_status; > - bio_endio(bio_orig); > - bio_put(bio); > -} > - > void blk_queue_non_cluster_bio(struct request_queue *q, struct bio **bio_orig) > { > - struct bio *bio; > struct bvec_iter iter; > - struct bio_vec from; > - unsigned i = 0; > - unsigned sectors = 0; > - unsigned short max_segs = min_t(unsigned short, BIO_MAX_PAGES, > - queue_max_segments(q)); > + struct bio *bio; > + struct bio_vec bv; > + unsigned short max_segs, segs = 0; > + > + bio = bio_alloc_bioset(GFP_NOIO, bio_segments(*bio_orig), > + &non_cluster_bio_set); bio_segments(*bio_orig) may be > 256, so bio_alloc_bioset() may fail. Thanks, Ming