Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756484AbYK3Xmc (ORCPT ); Sun, 30 Nov 2008 18:42:32 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754954AbYK3XmX (ORCPT ); Sun, 30 Nov 2008 18:42:23 -0500 Received: from mx2.redhat.com ([66.187.237.31]:40477 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754836AbYK3XmW (ORCPT ); Sun, 30 Nov 2008 18:42:22 -0500 Message-ID: <493324D1.7010202@redhat.com> Date: Mon, 01 Dec 2008 00:42:09 +0100 From: Milan Broz User-Agent: Thunderbird 2.0.0.17 (X11/20080914) MIME-Version: 1.0 To: Jens Axboe CC: Alasdair G Kergon , Neil Brown , Linux Kernel Mailing List Subject: [PATCH] block: fix setting of max_segment_size and seg_boundary mask X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5138 Lines: 132 Fix setting of max_segment_size and seg_boundary mask for stacked md/dm devices. When stacking devices (LVM over MD over SCSI) some of the request queue parameters are not set up correctly in some cases by default, namely max_segment_size and and seg_boundary mask. If you create MD device over SCSI, these attributes are zeroed. Problem become when there is over this mapping next device-mapper mapping - queue attributes are set in DM this way: request_queue max_segment_size seg_boundary_mask SCSI 65536 0xffffffff MD RAID1 0 0 LVM 65536 -1 (64bit) Unfortunately bio_add_page (resp. bio_phys_segments) calculates number of physical segments according to these parameters. During the generic_make_request() is segment cout recalculated and can increase bio->bi_phys_segments count over the allowed limit. (After bio_clone() in stack operation.) Thi is specially problem in CCISS driver, where it produce OOPS here BUG_ON(creq->nr_phys_segments > MAXSGENTRIES); (MAXSEGENTRIES is 31 by default.) Sometimes even this command is enough to cause oops: dd iflag=direct if=/dev// of=/dev/null bs=128000 count=10 This command generates bios with 250 sectors, allocated in 32 4k-pages (last page uses only 1024 bytes). For LVM layer, it allocates bio with 31 segments (still OK for CCISS), unfortunatelly on lower layer it is recalculated to 32 segments and this violates CCISS restriction and triggers BUG_ON(). Attached patch tries to fix it by: * initializing attributes above in queue request constructor blk_queue_make_request() * make sure that blk_queue_stack_limits() inherits setting (DM uses its own function to set the limits because it blk_queue_stack_limits() was introduced later. It should probably switch to use generic stack limit function too.) * sets the default seg_boundary value in one place (blkdev.h) * use this mask as default in DM (instead of -1, which differs in 64bit) Bugs related to this: https://bugzilla.redhat.com/show_bug.cgi?id=471639 http://bugzilla.kernel.org/show_bug.cgi?id=8672 Signed-off-by: Milan Broz --- block/blk-core.c | 2 +- block/blk-settings.c | 4 ++++ drivers/md/dm-table.c | 2 +- include/linux/blkdev.h | 2 ++ 4 files changed, 8 insertions(+), 2 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 10e8a64..b7d646c 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -592,7 +592,7 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id) 1 << QUEUE_FLAG_STACKABLE); q->queue_lock = lock; - blk_queue_segment_boundary(q, 0xffffffff); + blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK); blk_queue_make_request(q, __make_request); blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE); diff --git a/block/blk-settings.c b/block/blk-settings.c index 41392fb..afa55e1 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -125,6 +125,9 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn) q->nr_requests = BLKDEV_MAX_RQ; blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS); blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS); + blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK); + blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE); + q->make_request_fn = mfn; q->backing_dev_info.ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE; @@ -314,6 +317,7 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b) /* zero is "infinity" */ t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors); t->max_hw_sectors = min_not_zero(t->max_hw_sectors, b->max_hw_sectors); + t->seg_boundary_mask = min_not_zero(t->seg_boundary_mask, b->seg_boundary_mask); t->max_phys_segments = min(t->max_phys_segments, b->max_phys_segments); t->max_hw_segments = min(t->max_hw_segments, b->max_hw_segments); diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index a63161a..04e5fd7 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -668,7 +668,7 @@ static void check_for_valid_limits(struct io_restrictions *rs) if (!rs->max_segment_size) rs->max_segment_size = MAX_SEGMENT_SIZE; if (!rs->seg_boundary_mask) - rs->seg_boundary_mask = -1; + rs->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK; if (!rs->bounce_pfn) rs->bounce_pfn = -1; } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index a135256..9573945 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -921,6 +921,8 @@ extern void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter); #define MAX_SEGMENT_SIZE 65536 +#define BLK_SEG_BOUNDARY_MASK 0xFFFFFFFFUL + #define blkdev_entry_to_request(entry) list_entry((entry), struct request, queuelist) static inline int queue_hardsect_size(struct request_queue *q) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/