Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752206AbbERNOO (ORCPT ); Mon, 18 May 2015 09:14:14 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60747 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750802AbbERNOH (ORCPT ); Mon, 18 May 2015 09:14:07 -0400 Date: Mon, 18 May 2015 09:13:59 -0400 From: Mike Snitzer To: Jan Kara Cc: dm-devel@redhat.com, Jens Axboe , linux-kernel@vger.kernel.org, Chris Mason Subject: Re: [PATCH for-4.2 01/14] block: remove management of bi_remaining when restoring original bi_end_io Message-ID: <20150518131358.GA13998@redhat.com> References: <1431637512-64245-1-git-send-email-snitzer@redhat.com> <1431637512-64245-2-git-send-email-snitzer@redhat.com> <20150518072237.GE26351@quack.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150518072237.GE26351@quack.suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3758 Lines: 70 On Mon, May 18 2015 at 3:22am -0400, Jan Kara wrote: > On Thu 14-05-15 17:04:59, Mike Snitzer wrote: > > Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for > > non-chains") regressed all existing callers that followed this pattern: > > 1) saving a bio's original bi_end_io > > 2) wiring up an intermediate bi_end_io > > 3) restoring the original bi_end_io from intermediate bi_end_io > > 4) calling bio_endio() to execute the restored original bi_end_io > > > > The regression was due to BIO_CHAIN only ever getting set if > > bio_inc_remaining() is called. For the above pattern it isn't set until > > step 3 above (step 2 would've needed to establish BIO_CHAIN). As such > > the first bio_endio(), in step 2 above, never decremented __bi_remaining > > before calling the intermediate bi_end_io -- leaving __bi_remaining with > > the value 1 instead of 0. When bio_inc_remaining() occurred during step > > 3 it brought it to a value of 2. When the second bio_endio() was > > called, in step 4 above, it should've called the original bi_end_io but > > it didn't because there was an extra reference that wasn't dropped (due > > to atomic operations being optimized away since BIO_CHAIN wasn't set > > upfront). > > > > Fix this issue by removing the __bi_remaining management complexity for > > all callers that use the above pattern -- bio_chain() is the only > > interface that _needs_ to be concerned with __bi_remaining. For the > > above pattern callers just expect the bi_end_io they set to get called! > > Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls > > that aren't associated with the bio_chain() interface. > > > > The bio_inc_remaining() interface has been left exported because it is > > still useful for more elaborate uses of bio_chain() -- it will be used > > in an upcoming DM commit "dm thin: range discard support". > > > > Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains") > > Signed-off-by: Mike Snitzer > > Cc: Jan Kara > > Cc: Chris Mason > One question: What happens if you stack dm-thin on top of e.g. dm-linear? > dm-thin will do it's thing to a bio and passes it to dm-linear. That will > split & chain the bio so BIO_CHAIN will be set. And on IO completion you > will have troubles in dm-thinp as now bi_remaining gets decremented in > bio_endio(). That's the reason why I suggested that we should clear > BIO_CHAIN once bi_remaining hits zero... I think you need to be more precise in explaining the scenario you're concerned about. Could be there is an issue but I'm not seeing it yet. Are you referring to the patch that makes DM thinp use the proposed blkdev_issue_discard_async() interface? The bios issued to DM linear are generated by blkdev_issue_discard_async(). By using bio_chain() they establish ancestory with the parent DM thinp bio (which has had BIO_CHAIN set even before calling blkdev_issue_discard_async because there is potential for DM thinp to complete the parent bio before all N blkdev_issue_discard_async() generated bios complete -- so that is why DM thinp itself takes an extra reference on the parent bio using bio_inc_remaining() before calling blkdev_issue_discard_async) In practice I'm not having any problems with the device-mapper-test-suite's thin-provisioning tests, e.g.: dmtest run --suite thin-provisioning -n /discard/ (dmts happily uses dm-linear devices and stacks DM thinp ontop.) Mike -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/