Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp4080632imu; Tue, 18 Dec 2018 08:45:28 -0800 (PST) X-Google-Smtp-Source: AFSGD/VITPYv7kueM9Cdjz7GqaOnw+T9poMtEIMlVm5m2qAzv5Gx6KrzFiRBXsPMiz7YVUjXm3AT X-Received: by 2002:a62:ca9c:: with SMTP id y28mr17110672pfk.236.1545151528774; Tue, 18 Dec 2018 08:45:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1545151528; cv=none; d=google.com; s=arc-20160816; b=Cf2sNvpx0Yb5hVaYUWv5OYmxcv3/+eqT4DcwVFu7kGFRKvaS5/g7LbLNB3ugz6CoFn yH429N5I9lCdFP/afHGoQ66y4ZnT+zEYSlofKiOC3p4TFNVlOUVokIbPk8JoqcXc3c3k 6mVBaaZg5c84iyI4YW1vNhdcki3O7Irg87vSb8HSkG64yL2vvgpFwqt28FPyOMyBevie 6mbbST2cXpv4/OLr29w85a3MbOayHtwn+0asLr/svQuHj/Ulo3Ko0ahaIPLynKllSwD5 8nViKhhNKJdI1NSIFU17Xb0ShTYerpp/m8LG4HABjkOvPs0N3+HDUOqZlf6u8bFHNrgn cwhg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=+ECwFXtKf/Ts1ONkyEV5r7OGqF2vUosn4e6RFAkXJfU=; b=UhUqd5YxCdrpHrgOxeW0tNFUDTw1CKgJwB8v/hjznj1u+l1mXHXsL36CPZ5dhnH8oA 8JeP9cbbtxonjonTqQILI/isYKn0ZmtYcc8Z3N1PcrpozTP7q9zN/spfBCpdcZA2ifHR 1PDgL73TQ4V552CrJfCM0D+BAwUu9vylctmGdDB3Tr9b8Ztp4BVlChRNfOa0KGLzNGvg lqaOKU/HI4x8DfeTA0J6cY7c2tTghRr6dF38HTmxg2CEK0WBVF1wn756tSUALgMa89/A uEFnwFS5Ys1fHRojflhXsxuuYSrb/YMH/3gYPIjtSt0/hd7GZ6glqKQnt2a0iTgHb4Ki /LYA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=sTls1la8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p126si13326261pgp.529.2018.12.18.08.45.06; Tue, 18 Dec 2018 08:45:28 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=sTls1la8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727069AbeLRQmI (ORCPT + 99 others); Tue, 18 Dec 2018 11:42:08 -0500 Received: from mail.kernel.org ([198.145.29.99]:39764 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726422AbeLRQmG (ORCPT ); Tue, 18 Dec 2018 11:42:06 -0500 Received: from localhost (5356596B.cm-6-7b.dynamic.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 7B5AF21873; Tue, 18 Dec 2018 16:42:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1545151325; bh=D2eSeEIjVeqaEc/zNGqENmuVPLlBSusdkM015X+uJxc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=sTls1la8vBX2fxuBfUtLC26QoCgwyrlbCHMCzzI89VLewhAIccCI78//PSOPL2Wg3 Xu6Z6qkcLH92lB3J7R93ubmgP3HruU+yls3MveMf8f0Sjw4bqrFH1Q0b8EDPyQ/0RA EDXK7IHpANGpD+BxfRVXV/yr6XwJa49iFBf6g8aM= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Damien Le Moal , Mike Snitzer Subject: [PATCH 4.19 43/44] dm zoned: Fix target BIO completion handling Date: Tue, 18 Dec 2018 17:39:55 +0100 Message-Id: <20181218163932.444455036@linuxfoundation.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20181218163927.119623235@linuxfoundation.org> References: <20181218163927.119623235@linuxfoundation.org> User-Agent: quilt/0.65 X-stable: review X-Patchwork-Hint: ignore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.19-stable review patch. If anyone has any objections, please let me know. ------------------ From: Damien Le Moal commit d57f9da890696af1484f4a47f7f123560197865a upstream. struct bioctx includes the ref refcount_t to track the number of I/O fragments used to process a target BIO as well as ensure that the zone of the BIO is kept in the active state throughout the lifetime of the BIO. However, since decrementing of this reference count is done in the target .end_io method, the function bio_endio() must be called multiple times for read and write target BIOs, which causes problems with the value of the __bi_remaining struct bio field for chained BIOs (e.g. the clone BIO passed by dm core is large and splits into fragments by the block layer), resulting in incorrect values and inconsistencies with the BIO_CHAIN flag setting. This is turn triggers the BUG_ON() call: BUG_ON(atomic_read(&bio->__bi_remaining) <= 0); in bio_remaining_done() called from bio_endio(). Fix this ensuring that bio_endio() is called only once for any target BIO by always using internal clone BIOs for processing any read or write target BIO. This allows reference counting using the target BIO context counter to trigger the target BIO completion bio_endio() call once all data, metadata and other zone work triggered by the BIO complete. Overall, this simplifies the code too as the target .end_io becomes unnecessary and differences between read and write BIO issuing and completion processing disappear. Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman --- drivers/md/dm-zoned-target.c | 122 +++++++++++++------------------------------ 1 file changed, 38 insertions(+), 84 deletions(-) --- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -20,7 +20,6 @@ struct dmz_bioctx { struct dm_zone *zone; struct bio *bio; atomic_t ref; - blk_status_t status; }; /* @@ -78,65 +77,66 @@ static inline void dmz_bio_endio(struct { struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx)); - if (bioctx->status == BLK_STS_OK && status != BLK_STS_OK) - bioctx->status = status; - bio_endio(bio); + if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK) + bio->bi_status = status; + + if (atomic_dec_and_test(&bioctx->ref)) { + struct dm_zone *zone = bioctx->zone; + + if (zone) { + if (bio->bi_status != BLK_STS_OK && + bio_op(bio) == REQ_OP_WRITE && + dmz_is_seq(zone)) + set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags); + dmz_deactivate_zone(zone); + } + bio_endio(bio); + } } /* - * Partial clone read BIO completion callback. This terminates the + * Completion callback for an internally cloned target BIO. This terminates the * target BIO when there are no more references to its context. */ -static void dmz_read_bio_end_io(struct bio *bio) +static void dmz_clone_endio(struct bio *clone) { - struct dmz_bioctx *bioctx = bio->bi_private; - blk_status_t status = bio->bi_status; + struct dmz_bioctx *bioctx = clone->bi_private; + blk_status_t status = clone->bi_status; - bio_put(bio); + bio_put(clone); dmz_bio_endio(bioctx->bio, status); } /* - * Issue a BIO to a zone. The BIO may only partially process the + * Issue a clone of a target BIO. The clone may only partially process the * original target BIO. */ -static int dmz_submit_read_bio(struct dmz_target *dmz, struct dm_zone *zone, - struct bio *bio, sector_t chunk_block, - unsigned int nr_blocks) +static int dmz_submit_bio(struct dmz_target *dmz, struct dm_zone *zone, + struct bio *bio, sector_t chunk_block, + unsigned int nr_blocks) { struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx)); - sector_t sector; struct bio *clone; - /* BIO remap sector */ - sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block); - - /* If the read is not partial, there is no need to clone the BIO */ - if (nr_blocks == dmz_bio_blocks(bio)) { - /* Setup and submit the BIO */ - bio->bi_iter.bi_sector = sector; - atomic_inc(&bioctx->ref); - generic_make_request(bio); - return 0; - } - - /* Partial BIO: we need to clone the BIO */ clone = bio_clone_fast(bio, GFP_NOIO, &dmz->bio_set); if (!clone) return -ENOMEM; - /* Setup the clone */ - clone->bi_iter.bi_sector = sector; + bio_set_dev(clone, dmz->dev->bdev); + clone->bi_iter.bi_sector = + dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block); clone->bi_iter.bi_size = dmz_blk2sect(nr_blocks) << SECTOR_SHIFT; - clone->bi_end_io = dmz_read_bio_end_io; + clone->bi_end_io = dmz_clone_endio; clone->bi_private = bioctx; bio_advance(bio, clone->bi_iter.bi_size); - /* Submit the clone */ atomic_inc(&bioctx->ref); generic_make_request(clone); + if (bio_op(bio) == REQ_OP_WRITE && dmz_is_seq(zone)) + zone->wp_block += nr_blocks; + return 0; } @@ -214,7 +214,7 @@ static int dmz_handle_read(struct dmz_ta if (nr_blocks) { /* Valid blocks found: read them */ nr_blocks = min_t(unsigned int, nr_blocks, end_block - chunk_block); - ret = dmz_submit_read_bio(dmz, rzone, bio, chunk_block, nr_blocks); + ret = dmz_submit_bio(dmz, rzone, bio, chunk_block, nr_blocks); if (ret) return ret; chunk_block += nr_blocks; @@ -229,25 +229,6 @@ static int dmz_handle_read(struct dmz_ta } /* - * Issue a write BIO to a zone. - */ -static void dmz_submit_write_bio(struct dmz_target *dmz, struct dm_zone *zone, - struct bio *bio, sector_t chunk_block, - unsigned int nr_blocks) -{ - struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx)); - - /* Setup and submit the BIO */ - bio_set_dev(bio, dmz->dev->bdev); - bio->bi_iter.bi_sector = dmz_start_sect(dmz->metadata, zone) + dmz_blk2sect(chunk_block); - atomic_inc(&bioctx->ref); - generic_make_request(bio); - - if (dmz_is_seq(zone)) - zone->wp_block += nr_blocks; -} - -/* * Write blocks directly in a data zone, at the write pointer. * If a buffer zone is assigned, invalidate the blocks written * in place. @@ -265,7 +246,9 @@ static int dmz_handle_direct_write(struc return -EROFS; /* Submit write */ - dmz_submit_write_bio(dmz, zone, bio, chunk_block, nr_blocks); + ret = dmz_submit_bio(dmz, zone, bio, chunk_block, nr_blocks); + if (ret) + return ret; /* * Validate the blocks in the data zone and invalidate @@ -301,7 +284,9 @@ static int dmz_handle_buffered_write(str return -EROFS; /* Submit write */ - dmz_submit_write_bio(dmz, bzone, bio, chunk_block, nr_blocks); + ret = dmz_submit_bio(dmz, bzone, bio, chunk_block, nr_blocks); + if (ret) + return ret; /* * Validate the blocks in the buffer zone @@ -600,7 +585,6 @@ static int dmz_map(struct dm_target *ti, bioctx->zone = NULL; bioctx->bio = bio; atomic_set(&bioctx->ref, 1); - bioctx->status = BLK_STS_OK; /* Set the BIO pending in the flush list */ if (!nr_sectors && bio_op(bio) == REQ_OP_WRITE) { @@ -624,35 +608,6 @@ static int dmz_map(struct dm_target *ti, } /* - * Completed target BIO processing. - */ -static int dmz_end_io(struct dm_target *ti, struct bio *bio, blk_status_t *error) -{ - struct dmz_bioctx *bioctx = dm_per_bio_data(bio, sizeof(struct dmz_bioctx)); - - if (bioctx->status == BLK_STS_OK && *error) - bioctx->status = *error; - - if (!atomic_dec_and_test(&bioctx->ref)) - return DM_ENDIO_INCOMPLETE; - - /* Done */ - bio->bi_status = bioctx->status; - - if (bioctx->zone) { - struct dm_zone *zone = bioctx->zone; - - if (*error && bio_op(bio) == REQ_OP_WRITE) { - if (dmz_is_seq(zone)) - set_bit(DMZ_SEQ_WRITE_ERR, &zone->flags); - } - dmz_deactivate_zone(zone); - } - - return DM_ENDIO_DONE; -} - -/* * Get zoned device information. */ static int dmz_get_zoned_device(struct dm_target *ti, char *path) @@ -947,7 +902,6 @@ static struct target_type dmz_type = { .ctr = dmz_ctr, .dtr = dmz_dtr, .map = dmz_map, - .end_io = dmz_end_io, .io_hints = dmz_io_hints, .prepare_ioctl = dmz_prepare_ioctl, .postsuspend = dmz_suspend,