Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756511AbdGCOfE (ORCPT ); Mon, 3 Jul 2017 10:35:04 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:35936 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755636AbdGCNmZ (ORCPT ); Mon, 3 Jul 2017 09:42:25 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Eduardo Valentin , Cristian Gafton , Anchal Agarwal , Vallish Vaidyeshwara , Joe Thornber , Mike Snitzer Subject: [PATCH 4.9 042/172] dm thin: do not queue freed thin mapping for next stage processing Date: Mon, 3 Jul 2017 15:33:42 +0200 Message-Id: <20170703133416.219045055@linuxfoundation.org> X-Mailer: git-send-email 2.13.2 In-Reply-To: <20170703133414.260777365@linuxfoundation.org> References: <20170703133414.260777365@linuxfoundation.org> User-Agent: quilt/0.65 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4455 Lines: 122 4.9-stable review patch. If anyone has any objections, please let me know. ------------------ From: Vallish Vaidyeshwara commit 00a0ea33b495ee6149bf5a77ac5807ce87323abb upstream. process_prepared_discard_passdown_pt1() should cleanup dm_thin_new_mapping in cases of error. dm_pool_inc_data_range() can fail trying to get a block reference: metadata operation 'dm_pool_inc_data_range' failed: error = -61 When dm_pool_inc_data_range() fails, dm thin aborts current metadata transaction and marks pool as PM_READ_ONLY. Memory for thin mapping is released as well. However, current thin mapping will be queued onto next stage as part of queue_passdown_pt2() or passdown_endio(). This dangling thin mapping memory when processed and accessed in next stage will lead to device mapper crashing. Code flow without fix: -> process_prepared_discard_passdown_pt1(m) -> dm_thin_remove_range() -> discard passdown --> passdown_endio(m) queues m onto next stage -> dm_pool_inc_data_range() fails, frees memory m but does not remove it from next stage queue -> process_prepared_discard_passdown_pt2(m) -> processes freed memory m and crashes One such stack: Call Trace: [] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison] [] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool] [] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool] [] process_prepared+0x81/0xa0 [dm_thin_pool] [] do_worker+0xc5/0x820 [dm_thin_pool] [] ? __schedule+0x244/0x680 [] ? pwq_activate_delayed_work+0x42/0xb0 [] process_one_work+0x153/0x3f0 [] worker_thread+0x12b/0x4b0 [] ? rescuer_thread+0x350/0x350 [] kthread+0xca/0xe0 [] ? kthread_park+0x60/0x60 [] ret_from_fork+0x25/0x30 The fix is to first take the block ref count for discarded block and then do a passdown discard of this block. If block ref count fails, then bail out aborting current metadata transaction, mark pool as PM_READ_ONLY and also free current thin mapping memory (existing error handling code) without queueing this thin mapping onto next stage of processing. If block ref count succeeds, then passdown discard of this block. Discard callback of passdown_endio() will queue this thin mapping onto next stage of processing. Code flow with fix: -> process_prepared_discard_passdown_pt1(m) -> dm_thin_remove_range() -> dm_pool_inc_data_range() --> if fails, free memory m and bail out -> discard passdown --> passdown_endio(m) queues m onto next stage Reviewed-by: Eduardo Valentin Reviewed-by: Cristian Gafton Reviewed-by: Anchal Agarwal Signed-off-by: Vallish Vaidyeshwara Reviewed-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Greg Kroah-Hartman --- drivers/md/dm-thin.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -1095,6 +1095,19 @@ static void process_prepared_discard_pas return; } + /* + * Increment the unmapped blocks. This prevents a race between the + * passdown io and reallocation of freed blocks. + */ + r = dm_pool_inc_data_range(pool->pmd, m->data_block, data_end); + if (r) { + metadata_operation_failed(pool, "dm_pool_inc_data_range", r); + bio_io_error(m->bio); + cell_defer_no_holder(tc, m->cell); + mempool_free(m, pool->mapping_pool); + return; + } + discard_parent = bio_alloc(GFP_NOIO, 1); if (!discard_parent) { DMWARN("%s: unable to allocate top level discard bio for passdown. Skipping passdown.", @@ -1115,19 +1128,6 @@ static void process_prepared_discard_pas end_discard(&op, r); } } - - /* - * Increment the unmapped blocks. This prevents a race between the - * passdown io and reallocation of freed blocks. - */ - r = dm_pool_inc_data_range(pool->pmd, m->data_block, data_end); - if (r) { - metadata_operation_failed(pool, "dm_pool_inc_data_range", r); - bio_io_error(m->bio); - cell_defer_no_holder(tc, m->cell); - mempool_free(m, pool->mapping_pool); - return; - } } static void process_prepared_discard_passdown_pt2(struct dm_thin_new_mapping *m)