From: Christoph Hellwig
To: Jens Axboe, Song Liu
Cc: Jaegeuk Kim, Chao Yu, Philipp Reisner, Lars Ellenberg, Coly Li,
    Mike Snitzer, Josef Bacik, David Sterba, Damien Le Moal, Naohiro Aota,
    Andrew Morton, Ryusuke Konishi, linux-nilfs@vger.kernel.org,
    dm-devel@redhat.com, linux-f2fs-devel@lists.sourceforge.net,
    linux-block@vger.kernel.org, drbd-dev@lists.linbit.com,
    linux-bcache@vger.kernel.org, linux-raid@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
    linux-nfs@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 04/17] block: split bio_kmalloc from bio_alloc_bioset
Date: Tue, 26 Jan 2021 15:52:34 +0100
Message-Id: <20210126145247.1964410-5-hch@lst.de>
In-Reply-To: <20210126145247.1964410-1-hch@lst.de>
References: <20210126145247.1964410-1-hch@lst.de>

bio_kmalloc shares almost no logic with the bio_set-based fast path in
bio_alloc_bioset.  Split it into an entirely separate implementation.

Signed-off-by: Christoph Hellwig
---
 block/bio.c         | 167 ++++++++++++++++++++++----------------------
 include/linux/bio.h |   6 +-
 2 files changed, 86 insertions(+), 87 deletions(-)
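As review context, a minimal sketch of the calling convention that results
from the split (illustrative only, not part of the patch; the
example_get_bio() helper is made up):

	/*
	 * Hypothetical helper: after this split a caller picks the
	 * allocator explicitly instead of passing a NULL bio_set to
	 * bio_alloc_bioset().
	 */
	static struct bio *example_get_bio(struct bio_set *bs,
					   unsigned int nr_vecs)
	{
		if (bs)	/* mempool backed, see the bio_alloc_bioset() comment */
			return bio_alloc_bioset(GFP_KERNEL, nr_vecs, bs);
		/* plain kmalloc; bi_pool stays NULL, so bio_put() ends in kfree() */
		return bio_kmalloc(GFP_KERNEL, nr_vecs);
	}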
diff --git a/block/bio.c b/block/bio.c
index dfd7740a32300a..d4375619348c52 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -396,123 +396,101 @@ static void punt_bios_to_rescuer(struct bio_set *bs)
  * @nr_iovecs: number of iovecs to pre-allocate
  * @bs: the bio_set to allocate from.
  *
- * Description:
- * If @bs is NULL, uses kmalloc() to allocate the bio; else the allocation is
- * backed by the @bs's mempool.
+ * Allocate a bio from the mempools in @bs.
  *
- * When @bs is not NULL, if %__GFP_DIRECT_RECLAIM is set then bio_alloc will
- * always be able to allocate a bio. This is due to the mempool guarantees.
- * To make this work, callers must never allocate more than 1 bio at a time
- * from this pool. Callers that need to allocate more than 1 bio must always
- * submit the previously allocated bio for IO before attempting to allocate
- * a new one. Failure to do so can cause deadlocks under memory pressure.
+ * If %__GFP_DIRECT_RECLAIM is set then bio_alloc will always be able to
+ * allocate a bio. This is due to the mempool guarantees. To make this work,
+ * callers must never allocate more than 1 bio at a time from the general pool.
+ * Callers that need to allocate more than 1 bio must always submit the
+ * previously allocated bio for IO before attempting to allocate a new one.
+ * Failure to do so can cause deadlocks under memory pressure.
  *
- * Note that when running under submit_bio_noacct() (i.e. any block
- * driver), bios are not submitted until after you return - see the code in
- * submit_bio_noacct() that converts recursion into iteration, to prevent
- * stack overflows.
+ * Note that when running under submit_bio_noacct() (i.e. any block driver),
+ * bios are not submitted until after you return - see the code in
+ * submit_bio_noacct() that converts recursion into iteration, to prevent
+ * stack overflows.
  *
- * This would normally mean allocating multiple bios under
- * submit_bio_noacct() would be susceptible to deadlocks, but we have
- * deadlock avoidance code that resubmits any blocked bios from a rescuer
- * thread.
+ * This would normally mean allocating multiple bios under submit_bio_noacct()
+ * would be susceptible to deadlocks, but we have
+ * deadlock avoidance code that resubmits any blocked bios from a rescuer
+ * thread.
  *
- * However, we do not guarantee forward progress for allocations from other
- * mempools. Doing multiple allocations from the same mempool under
- * submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
- * for per bio allocations.
+ * However, we do not guarantee forward progress for allocations from other
+ * mempools. Doing multiple allocations from the same mempool under
+ * submit_bio_noacct() should be avoided - instead, use bio_set's front_pad
+ * for per bio allocations.
  *
- * RETURNS:
- * Pointer to new bio on success, NULL on failure.
+ * Returns: Pointer to new bio on success, NULL on failure.
  */
 struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 			     struct bio_set *bs)
 {
 	gfp_t saved_gfp = gfp_mask;
-	unsigned front_pad;
-	unsigned inline_vecs;
-	struct bio_vec *bvl = NULL;
 	struct bio *bio;
 	void *p;
 
-	if (!bs) {
-		if (nr_iovecs > UIO_MAXIOV)
-			return NULL;
-
-		p = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
-		front_pad = 0;
-		inline_vecs = nr_iovecs;
-	} else {
-		/* should not use nobvec bioset for nr_iovecs > 0 */
-		if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) &&
-				 nr_iovecs > 0))
-			return NULL;
-		/*
-		 * submit_bio_noacct() converts recursion to iteration; this
-		 * means if we're running beneath it, any bios we allocate and
-		 * submit will not be submitted (and thus freed) until after we
-		 * return.
-		 *
-		 * This exposes us to a potential deadlock if we allocate
-		 * multiple bios from the same bio_set() while running
-		 * underneath submit_bio_noacct(). If we were to allocate
-		 * multiple bios (say a stacking block driver that was splitting
-		 * bios), we would deadlock if we exhausted the mempool's
-		 * reserve.
-		 *
-		 * We solve this, and guarantee forward progress, with a rescuer
-		 * workqueue per bio_set. If we go to allocate and there are
-		 * bios on current->bio_list, we first try the allocation
-		 * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
-		 * bios we would be blocking to the rescuer workqueue before
-		 * we retry with the original gfp_flags.
-		 */
-
-		if (current->bio_list &&
-		    (!bio_list_empty(&current->bio_list[0]) ||
-		     !bio_list_empty(&current->bio_list[1])) &&
-		    bs->rescue_workqueue)
-			gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+	/* should not use nobvec bioset for nr_iovecs > 0 */
+	if (WARN_ON_ONCE(!mempool_initialized(&bs->bvec_pool) && nr_iovecs > 0))
+		return NULL;
+	/*
+	 * submit_bio_noacct() converts recursion to iteration; this means if
+	 * we're running beneath it, any bios we allocate and submit will not be
+	 * submitted (and thus freed) until after we return.
+	 *
+	 * This exposes us to a potential deadlock if we allocate multiple bios
+	 * from the same bio_set() while running underneath submit_bio_noacct().
+	 * If we were to allocate multiple bios (say a stacking block driver
+	 * that was splitting bios), we would deadlock if we exhausted the
+	 * mempool's reserve.
+	 *
+	 * We solve this, and guarantee forward progress, with a rescuer
+	 * workqueue per bio_set. If we go to allocate and there are bios on
+	 * current->bio_list, we first try the allocation without
+	 * __GFP_DIRECT_RECLAIM; if that fails, we punt those bios we would be
+	 * blocking to the rescuer workqueue before we retry with the original
+	 * gfp_flags.
+	 */
+	if (current->bio_list &&
+	    (!bio_list_empty(&current->bio_list[0]) ||
+	     !bio_list_empty(&current->bio_list[1])) &&
+	    bs->rescue_workqueue)
+		gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+
+	p = mempool_alloc(&bs->bio_pool, gfp_mask);
+	if (!p && gfp_mask != saved_gfp) {
+		punt_bios_to_rescuer(bs);
+		gfp_mask = saved_gfp;
 		p = mempool_alloc(&bs->bio_pool, gfp_mask);
-		if (!p && gfp_mask != saved_gfp) {
-			punt_bios_to_rescuer(bs);
-			gfp_mask = saved_gfp;
-			p = mempool_alloc(&bs->bio_pool, gfp_mask);
-		}
-
-		front_pad = bs->front_pad;
-		inline_vecs = BIO_INLINE_VECS;
 	}
-
 	if (unlikely(!p))
 		return NULL;
 
-	bio = p + front_pad;
-	bio_init(bio, NULL, 0);
-
-	if (nr_iovecs > inline_vecs) {
+	bio = p + bs->front_pad;
+	if (nr_iovecs > BIO_INLINE_VECS) {
 		unsigned long idx = 0;
+		struct bio_vec *bvl = NULL;
 
 		bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
 		if (!bvl && gfp_mask != saved_gfp) {
 			punt_bios_to_rescuer(bs);
 			gfp_mask = saved_gfp;
-			bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx, &bs->bvec_pool);
+			bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx,
+					 &bs->bvec_pool);
 		}
 
 		if (unlikely(!bvl))
 			goto err_free;
 
 		bio->bi_flags |= idx << BVEC_POOL_OFFSET;
-		bio->bi_max_vecs = bvec_nr_vecs(idx);
+		bio_init(bio, bvl, bvec_nr_vecs(idx));
 	} else if (nr_iovecs) {
-		bvl = bio->bi_inline_vecs;
-		bio->bi_max_vecs = inline_vecs;
+		bio_init(bio, bio->bi_inline_vecs, BIO_INLINE_VECS);
+	} else {
+		bio_init(bio, NULL, 0);
 	}
 
 	bio->bi_pool = bs;
-	bio->bi_io_vec = bvl;
 	return bio;
 
 err_free:
@@ -521,6 +499,31 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 }
 EXPORT_SYMBOL(bio_alloc_bioset);
 
+/**
+ * bio_kmalloc - kmalloc a bio for I/O
+ * @gfp_mask: the GFP_* mask given to the slab allocator
+ * @nr_iovecs: number of iovecs to pre-allocate
+ *
+ * Use kmalloc to allocate and initialize a bio.
+ *
+ * Returns: Pointer to new bio on success, NULL on failure.
+ */
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
+{
+	struct bio *bio;
+
+	if (nr_iovecs > UIO_MAXIOV)
+		return NULL;
+
+	bio = kmalloc(struct_size(bio, bi_inline_vecs, nr_iovecs), gfp_mask);
+	if (unlikely(!bio))
+		return NULL;
+	bio_init(bio, nr_iovecs ? bio->bi_inline_vecs : NULL, nr_iovecs);
+	bio->bi_pool = NULL;
+	return bio;
+}
+EXPORT_SYMBOL(bio_kmalloc);
+
 void zero_fill_bio_iter(struct bio *bio, struct bvec_iter start)
 {
 	unsigned long flags;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 676870b2c88d80..c74857cf12528c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -408,6 +408,7 @@ extern int biovec_init_pool(mempool_t *pool, int pool_entries);
 extern int bioset_init_from_src(struct bio_set *bs, struct bio_set *src);
 
 extern struct bio *bio_alloc_bioset(gfp_t, unsigned int, struct bio_set *);
+struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs);
 extern void bio_put(struct bio *);
 
 extern void __bio_clone_fast(struct bio *, struct bio *);
@@ -420,11 +421,6 @@ static inline struct bio *bio_alloc(gfp_t gfp_mask, unsigned int nr_iovecs)
 	return bio_alloc_bioset(gfp_mask, nr_iovecs, &fs_bio_set);
 }
 
-static inline struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned int nr_iovecs)
-{
-	return bio_alloc_bioset(gfp_mask, nr_iovecs, NULL);
-}
-
 extern blk_qc_t submit_bio(struct bio *);
 extern void bio_endio(struct bio *);
 
-- 
2.29.2
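
A rough illustration of the forward-progress rule restated in the
bio_alloc_bioset() comment above (hold at most one bio from a given mempool;
submit it before allocating the next). The example_write_chunks() helper,
the chunk geometry, and the elided page/completion setup are hypothetical,
not part of this patch:

	/*
	 * Sketch of the documented allocation pattern: each bio is
	 * submitted before the next one is allocated, so a single thread
	 * never holds more than one bio from fs_bio_set's mempool.
	 */
	static void example_write_chunks(struct block_device *bdev,
					 sector_t first_sector, int nr_chunks)
	{
		int i;

		for (i = 0; i < nr_chunks; i++) {
			/* cannot fail: GFP_KERNEL includes __GFP_DIRECT_RECLAIM */
			struct bio *bio = bio_alloc(GFP_KERNEL, 1);

			bio_set_dev(bio, bdev);
			bio->bi_opf = REQ_OP_WRITE;
			bio->bi_iter.bi_sector = first_sector + i * 8;
			/* ... bio_add_page() and a bi_end_io handler go here ... */
			submit_bio(bio);	/* submit before the next allocation */
		}
	}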