Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758425Ab3CYPfm (ORCPT ); Mon, 25 Mar 2013 11:35:42 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50928 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757479Ab3CYPfl (ORCPT ); Mon, 25 Mar 2013 11:35:41 -0400 Message-ID: <51506EBA.8060708@redhat.com> Date: Mon, 25 Mar 2013 16:35:22 +0100 From: Jan Vesely User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4 MIME-Version: 1.0 To: Jens Axboe CC: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, linux-fsdevel@vger.kernel.org, Alexander Viro , fujita.tomonori@lab.ntt.co.jp, =?UTF-8?B?S2FpIE3DpGtpc2FyYQ==?= , James Bottomley Subject: Re: [PATCH RESEND v2] block: modify __bio_add_page check to accept pages that don't start a new segment References: <51505AC1.60809@redhat.com> <20130325142457.GD5401@kernel.dk> In-Reply-To: <20130325142457.GD5401@kernel.dk> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4591 Lines: 129 On Mon 25 Mar 2013 15:24:57 CET, Jens Axboe wrote: > On Mon, Mar 25 2013, Jan Vesely wrote: >> v2: changed a comment >> >> The original behavior was to refuse all pages after the maximum number of >> segments has been reached. However, some drivers (like st) craft their buffers >> to potentially require exactly max segments and multiple pages in the last >> segment. This patch modifies the check to allow pages that can be merged into >> the last segment. >> >> Fixes EBUSY failures when using large tape block size in high >> memory fragmentation condition. >> This regression was introduced by commit >> 46081b166415acb66d4b3150ecefcd9460bb48a1 >> st: Increase success probability in driver buffer allocation >> >> Signed-off-by: Jan Vesely >> >> CC: Alexander Viro >> CC: FUJITA Tomonori >> CC: Kai Makisara >> CC: James Bottomley >> CC: Jens Axboe >> CC: stable@vger.kernel.org >> --- >> fs/bio.c | 27 +++++++++++++++++---------- >> 1 file changed, 17 insertions(+), 10 deletions(-) >> >> diff --git a/fs/bio.c b/fs/bio.c >> index bb5768f..bc6af71 100644 >> --- a/fs/bio.c >> +++ b/fs/bio.c >> @@ -500,7 +500,6 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page >> *page, unsigned int len, unsigned int offset, >> unsigned short max_sectors) >> { >> - int retried_segments = 0; >> struct bio_vec *bvec; >> >> /* >> @@ -551,18 +550,13 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page >> return 0; >> >> /* >> - * we might lose a segment or two here, but rather that than >> - * make this too complex. >> + * The first part of the segment count check, >> + * reduce segment count if possible >> */ >> >> - while (bio->bi_phys_segments >= queue_max_segments(q)) { >> - >> - if (retried_segments) >> - return 0; >> - >> - retried_segments = 1; >> + if (bio->bi_phys_segments >= queue_max_segments(q)) >> blk_recount_segments(q, bio); >> - } >> + >> >> /* >> * setup the new entry, we might clear it again later if we >> @@ -572,6 +566,19 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page >> bvec->bv_page = page; >> bvec->bv_len = len; >> bvec->bv_offset = offset; >> + >> + /* >> + * the other part of the segment count check, allow mergeable pages >> + */ >> + if ((bio->bi_phys_segments > queue_max_segments(q)) || >> + ( (bio->bi_phys_segments == queue_max_segments(q)) && >> + !BIOVEC_PHYS_MERGEABLE(bvec - 1, bvec))) { >> + bvec->bv_page = NULL; >> + bvec->bv_len = 0; >> + bvec->bv_offset = 0; >> + return 0; >> + } >> + > > This is a bit messy, I think. bi_phys_segments should never be allowed > to go beyond queue_ma_segments(), so the > test does not look right. > Maybe it's an artifact of when we fall through with this patch, we bump > bi_phys_segments even if the segments are physicall contig and > mergeable. yeah. it is messy, I tried to go for the least invasive changes. I took the '>' test from the original while loop '>='. The original behavior guaranteed bio->bi_phys_segments <= max_segments, if the bio satisfied this condition to begin with. I did not find any guarantees that the 'bio' parameter of this function has to satisfy this condition in general. My understanding is that if a caller of this function (or one of the two that call this one) provides an invalid (segment-count-wise) bio, it will fail (return 0 added length), and let the caller handle the situation. I admit, I did not check all the call paths that use these functions. > > What happens when the segment is physically mergeable, but the resulting > merged segment is too large (bigger than q->limits.max_segment_size)? > ah, yes. I guess I need a check that follows __blk_recalc_rq_segments more closely. We know that at this point all pages are merged into segments, so a helper function that would be used by both __blk_recalc_rq_segments and this check is possible. I still assume that a temporary increase of bi_phys_segments above max_segments is ok. If we want to avoid this situation we would need to merge tail pages right away. That's imo uglier. thanks -- Jan Vesely -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/