Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933461AbdCHA4J (ORCPT );
	Tue, 7 Mar 2017 19:56:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:35186 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933423AbdCHA4H (ORCPT );
	Tue, 7 Mar 2017 19:56:07 -0500
Date: Tue, 7 Mar 2017 18:01:04 -0500
From: Mike Snitzer
To: NeilBrown
Cc: Jens Axboe, Jack Wang, LKML, Lars Ellenberg, Kent Overstreet,
	Pavel Machek, Mikulas Patocka
Subject: Re: blk: improve order of bio handling in generic_make_request()
Message-ID: <20170307230104.GA3671@redhat.com>
References: <87h93blz6g.fsf@notabene.neil.brown.name>
	<71562c2c-97f4-9a0a-32ec-30e0702ca575@profitbricks.com>
	<87lgsjj9w8.fsf@notabene.neil.brown.name>
	<20170307165233.GB30230@redhat.com>
	<5cfbdc6b-9ba7-605a-642b-7f625cf5f5b7@kernel.dk>
	<20170307171436.GA2109@redhat.com>
	<87tw74j0e4.fsf@notabene.neil.brown.name>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87tw74j0e4.fsf@notabene.neil.brown.name>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.27]); Tue, 07 Mar 2017 23:01:08 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1944
Lines: 54

On Tue, Mar 07 2017 at 3:29pm -0500,
NeilBrown wrote:

> On Tue, Mar 07 2017, Mike Snitzer wrote:
>
> > On Tue, Mar 07 2017 at 12:05pm -0500,
> > Jens Axboe wrote:
> >
> >> On 03/07/2017 09:52 AM, Mike Snitzer wrote:
> >> >
> >> > In addition to Jack's MD raid test there is a DM snapshot deadlock test,
> >> > albeit unpolished/needy to get running, see:
> >> > https://www.redhat.com/archives/dm-devel/2017-January/msg00064.html
> >>
> >> Can you run this patch with that test, reverting your DM workaround?
> >
> > Yeap, will do. Last time Mikulas tried a similar patch it still
> > deadlocked.
> > But I'll give it a go (likely tomorrow).
>
> I don't think this will fix the DM snapshot deadlock by itself.
> Rather, it make it possible for some internal changes to DM to fix it.
> The DM change might be something vaguely like:
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3086da5664f3..06ee0960e415 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1216,6 +1216,14 @@ static int __split_and_process_non_flush(struct clone_info *ci)
>
>  	len = min_t(sector_t, max_io_len(ci->sector, ti), ci->sector_count);
>
> +	if (len < ci->sector_count) {
> +		struct bio *split = bio_split(bio, len, GFP_NOIO, fs_bio_set);
> +		bio_chain(split, bio);
> +		generic_make_request(bio);
> +		bio = split;
> +		ci->sector_count = len;
> +	}
> +
>  	r = __clone_and_map_data_bio(ci, ti, ci->sector, &len);
>  	if (r < 0)
>  		return r;
>
> Instead of looping inside DM, this change causes the remainder to be
> passed to generic_make_request() and DM only handles or region at a
> time. So there is only one loop, in the top generic_make_request().
> That loop will not reliable handle bios in the "right" order.

s/not reliable/now reliably/ ? ;)

But thanks for the suggestion Neil. Will dig in once I get through a
backlog of other DM target code I have queued for 4.12 review.

Mike
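
For illustration only, the "one loop at the top" idea in Neil's sketch can
be modelled in userspace. This is a rough toy, not kernel code and not the
actual DM change: toy_bio, toy_queue, toy_max_io_len(), toy_make_request()
and toy_submit() are hypothetical names made up for this example, a plain
FIFO array is assumed in place of the pending list kept by the top-level
generic_make_request(), and a fixed boundary is assumed in place of
max_io_len().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 64

/* Hypothetical stand-in for a bio: a start sector and a sector count. */
struct toy_bio {
	unsigned long sector;
	unsigned long count;
};

/* Hypothetical stand-in for the pending list of the top-level submitter. */
struct toy_queue {
	struct toy_bio items[TOY_QUEUE_MAX];
	size_t head, tail;
};

static void toy_push(struct toy_queue *q, struct toy_bio b)
{
	assert(q->tail < TOY_QUEUE_MAX);
	q->items[q->tail++] = b;
}

static int toy_pop(struct toy_queue *q, struct toy_bio *out)
{
	if (q->head == q->tail)
		return 0;
	*out = q->items[q->head++];
	return 1;
}

/* Sectors from 'sector' up to the next boundary (assumed fixed size). */
static unsigned long toy_max_io_len(unsigned long sector, unsigned long boundary)
{
	return boundary - (sector % boundary);
}

/*
 * Handle at most one region: if the bio crosses a boundary, queue the
 * remainder for the top-level loop and keep only the front piece.
 */
static void toy_make_request(struct toy_queue *pending, struct toy_bio bio,
			     unsigned long boundary,
			     struct toy_bio *done, size_t *ndone)
{
	unsigned long len = toy_max_io_len(bio.sector, boundary);

	if (len < bio.count) {
		struct toy_bio rest = { bio.sector + len, bio.count - len };

		toy_push(pending, rest);	/* remainder goes back to the top loop */
		bio.count = len;		/* front piece is handled now */
	}
	done[(*ndone)++] = bio;
}

/* The only loop: drain the pending list, recording pieces in handling order. */
size_t toy_submit(struct toy_bio bio, unsigned long boundary,
		  struct toy_bio *done, size_t done_max)
{
	struct toy_queue pending = { .head = 0, .tail = 0 };
	size_t ndone = 0;

	assert(boundary > 0);
	toy_push(&pending, bio);
	while (toy_pop(&pending, &bio)) {
		assert(ndone < done_max);
		toy_make_request(&pending, bio, boundary, done, &ndone);
	}
	return ndone;
}
```

With a 20-sector bio starting at sector 5 and an 8-sector boundary,
toy_submit() records the pieces (5,3), (8,8), (16,8), (24,1) in that order;
the handler itself never loops over the remainder.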