Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933025AbdCJSvS (ORCPT ); Fri, 10 Mar 2017 13:51:18 -0500 Received: from mail-wm0-f42.google.com ([74.125.82.42]:36633 "EHLO mail-wm0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755427AbdCJSvH (ORCPT ); Fri, 10 Mar 2017 13:51:07 -0500 Date: Fri, 10 Mar 2017 19:51:03 +0100 From: Lars Ellenberg To: Jack Wang Cc: Mikulas Patocka , Mike Snitzer , NeilBrown , Jens Axboe , LKML , Kent Overstreet , Pavel Machek , linux-raid@vger.kernel.org, device-mapper development , linux-block@vger.kernel.org Subject: Re: [PATCH v2] blk: improve order of bio handling in generic_make_request() Message-ID: <20170310185103.GA15851@soda.linbit> References: <87lgsjj9w8.fsf@notabene.neil.brown.name> <87r328j00i.fsf@notabene.neil.brown.name> <87d1dphhuy.fsf@notabene.neil.brown.name> <58be6551-4aa7-72ee-1616-a1545606d029@kernel.dk> <87varhg14d.fsf@notabene.neil.brown.name> <20170310143822.GA23879@redhat.com> <153a6cff-c553-0d18-e15b-4f3defc3a42b@profitbricks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <153a6cff-c553-0d18-e15b-4f3defc3a42b@profitbricks.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2750 Lines: 77 On Fri, Mar 10, 2017 at 04:07:58PM +0100, Jack Wang wrote: > On 10.03.2017 15:55, Mikulas Patocka wrote: > > On Fri, 10 Mar 2017, Mike Snitzer wrote: > >> On Fri, Mar 10 2017 at 7:34am -0500, > >> Lars Ellenberg wrote: > >> > >>>> --- a/block/blk-core.c > >>>> +++ b/block/blk-core.c > >>>> @@ -1975,7 +1975,14 @@ generic_make_request_checks(struct bio *bio) > >>>> */ > >>>> blk_qc_t generic_make_request(struct bio *bio) > >>>> { > >>>> - struct bio_list bio_list_on_stack; > >>>> + /* > >>>> + * bio_list_on_stack[0] contains bios submitted by the current > >>>> + * make_request_fn. > >>>> + * bio_list_on_stack[1] contains bios that were submitted before > >>>> + * the current make_request_fn, but that haven't been processed > >>>> + * yet. > >>>> + */ > >>>> + struct bio_list bio_list_on_stack[2]; > >>>> blk_qc_t ret = BLK_QC_T_NONE; > >>> > >>> May I suggest that, if you intend to assign something that is not a > >>> plain &(struct bio_list), but a &(struct bio_list[2]), > >>> you change the task member so it is renamed (current->bio_list vs > >>> current->bio_lists, plural, is what I did last year). > >>> Or you will break external modules, silently, and horribly (or, > >>> rather, they won't notice, but break the kernel). > >>> Examples of such modules would be DRBD, ZFS, quite possibly others. > > It's better to make external modules not compile than to silently > > introduce bugs in them. So yes, I would rename that. > > > > Mikulas > > Agree, better rename current->bio_list to current->bio_lists > > Regards, > Jack Thank you. (I don't know if some one does, but...) Thing is: *IF* some external thing messes with current->bio_list in "interesting" ways, and not just the "I don't care, one level of real recursion fixes this for me" pattern of struct bio_list *tmp = current->bio_list; current->bio_list = NULL; submit_bio() current->bio_list = tmp; you get a chance of stack corruption, without even as much as a compiler warning. Which is why I wrote https://lkml.org/lkml/2016/7/8/469 ... Instead, I suggest to distinguish between recursive calls to generic_make_request(), and pushing back the remainder part in blk_queue_split(), by pointing current->bio_lists to a struct recursion_to_iteration_bio_lists { struct bio_list recursion; struct bio_list queue; } By providing each q->make_request_fn() with an empty "recursion" bio_list, then merging any recursively submitted bios to the head of the "queue" list, we can make the recursion-to-iteration logic in generic_make_request() process deepest level bios first, and "sibling" bios of the same level in "natural" order. ... Cheers, Lars